Preface


Sources in the Development of Mathematics: Series and Products from the Fifteenth to 
the Twenty-first Century, my book of 2011, was intended for an audience of graduate 
students or beyond. However, since much of its mathematics lies at the foundations of 
the undergraduate mathematics curriculum, I decided to use portions of my book as the 
text for an advanced undergraduate course. I was very pleased to find that my curious 
and diligent students, of varied levels of mathematical talent, could understand a good 
bit of the material and get insight into mathematics they had already studied as well 
as topics with which they were unfamiliar. Of course, the students could profitably 
study such topics from good textbooks. But I observed that when they read original 
proofs, perhaps with gaps or with slightly opaque arguments, students gained very 
valuable insight into the process of mathematical thinking and intuition. Moreover, the 
study of the steps, often over long periods of time, by which earlier mathematicians 
refined and clarified their arguments revealed to my students the essential points at the 
crux of those results, points that may be more difficult to discern in later streamlined 
presentations. As they worked to understand the material, my students witnessed the 
difficulty and beauty of original mathematical work, and this was a source of great 
enjoyment to many of them. I have now thrice taught this course, with extremely 
positive student response. 

In order for my students to follow the foundational mathematical arguments 
in Sources, I was often required to provide additional material, material actually 
contained in the original works of the mathematicians being studied. I therefore 
decided to expand my book, as a second edition in two volumes, to make it more 
accessible to readers, from novices to accomplished mathematicians. This second 
edition contains about 250 pages of new material, including more details within the 
original proofs, elaborations and further developments of results, and additional results 
that may give the reader a better perspective. Furthermore, to give the material greater 
focus, I have limited this second edition to the topics of series and products, areas that 
today permeate both applied and pure mathematics; the second edition is thus entitled 
Series and Products in the Development of Mathematics. 

This first volume of my work discusses the development of the fundamental though 
powerful and essential methods in series and products that do not employ complex 
analytic methods or sophisticated machinery such as Fourier transforms. Much of 
this material would be accessible, perhaps with guidance, to advanced undergraduate 
students. The second volume deals with more recent work and requires considerable 
mathematical background. For example, in volume 2, I discuss Weil’s 1949 paper on 
solutions of equations in finite fields and de Branges’s conquest of the Bieberbach 
conjecture. Each volume contains the same complete bibliography. 

The exercises at the end of the chapters present many additional original results and 
may be studied simply for the supplementary theorems they contain. The exercises 
are accompanied by references to the original works, as an aid to further research. 
Readers may attempt to prove the results in the problems and, by use of the references, 
compare their own solutions with the originals. Moreover, many of the exercises can 
be tackled by methods similar to those given in the text, so that some exercises can be 
realistically assigned to a class as homework. I assigned many exercises to my classes, 
and found that the students enjoyed and benefited from their efforts to find solutions. 
Thus, the exercises may be useful as problems to be solved, and also for the results 
they present. 

Detailed study of original mathematical works provides a point of entry into the 
minds of the creators of powerful theories, and thus into the theories themselves. 
But tracing the discovery and evolution of mathematical ideas and theorems entails 
the examination of many, many papers, letters, notes, and monographs. For example, 
in this work I have discussed the work of more than three hundred mathematicians, 
including arguments and theorems contained in approximately one hundred works and 
letters of Euler alone. Locating, studying, and grasping the interconnections among 
such original works and results is a ponderous, complex, and rewarding effort. In this 
second edition, I have added numerous footnotes and almost five hundred works to the 
bibliography. My hope is that the detailed footnotes and the expanded bibliography, 
containing both original works and works of distinguished expositors and historians 
of mathematics, may encourage and facilitate the efforts of those who wish to search 
out and study the original sources of our inherited mathematical wealth. 

I first wish to thank my wife, who typeset and edited this work, made innumerable 
corrections and refinements to the text, and devotedly assisted me with translations and 
locating references. I am also very grateful to NFN Kalyan for his encouragement and 
for creating the eloquent artwork for the cover of these volumes. I greatly appreciate 
Maitreyi Lagunas’s unflagging support and interest. I thank Bruce Atwood who 
cheerfully constructed the nice diagrams contained in this work, and Paul Campbell 
who generously provided expert technical support and advice. I am grateful to 
my student Shambhavi Upadhyaya, who has an unusual ability to proofread very 
accurately, for spending so much time giving useful suggestions for improvement. I 
am indebted to my students whose questions and enthusiasm helped me refine this 
second edition. I also thank the very capable librarians at Beloit College, especially 
Chris Nelson and Cindy Cooley. Finally, I wish to acknowledge the inspiration 
provided me by my friend, the late Dick Askey. 


Power Series in Fifteenth-Century Kerala 


1.1 Preliminary Remarks 


More than two and a half centuries before Isaac Newton discovered the sine and cosine 
series and James Gregory the arctan series, the Indian astronomer and mathematician 
Madhava (c. 1340-c. 1425) gave expressions for sin x, cos x, and arctan x as infinite 
power series.¹ Madhava’s work may have been motivated by his studies in astronomy, 
since he concentrated mainly on the trigonometric functions. There appears to be 
no connection between the work of Madhava’s school and that of Newton and other 
European mathematicians. In spite of this, the Keralese and European mathematicians 
shared some similar methods and results. Both were fascinated with transformation of 
series, though they used very different methods. 

The mathematician-astronomers of medieval Kerala lived, worked, and taught in 
large family compounds called illams. Madhava, believed to have been the founder of 
the school, worked in the Bakulavihara illam in the town of Sangamagrama, a few 
miles north of Cochin. He was an Emprantiri Brahmin, then considered socially 
inferior to the dominant Namputiri (or Nambudri) Brahmin. This position does not 
appear to have curtailed his teaching activities; his most distinguished pupil was 
Paramesvara, a Namputiri Brahmin. No mathematical works of Madhava have been 
found, though three of his short treatises on astronomy are extant. The most important 
of these describes how to accurately determine the position of the moon at any 
time of the day. Other surviving mathematical works of the Kerala school attribute 
many very significant results to Madhava. Although his algebraic notation was almost 
primitive, Madhava’s mathematical skill allowed him to carry out highly original and 
difficult research. 

Paramesvara (c. 1380–c. 1460), Madhava’s pupil, was from Asvattagram, about 
thirty-five miles northeast of Madhava’s home town. He belonged to the Vatasreni 
illam, a famous center for astronomy and mathematics. He made a series of 
observations of the eclipses of the sun and the moon between 1395 and 1432 and 
composed several astronomical texts, the last of which was written in the 1450s, 


1 Newton (1959-1960) vol. 2, pp. 20-47, especially p. 36; Turnbull (1939) p. 170; Jyesthadeva et al. (2008). 

near the end of his life. Sankara Variyar attributed to Paramesvara a formula for the 
radius of a circle in terms of the sides of an inscribed quadrilateral. See Exercise 
4. Paramesvara’s son, Damodara, was the teacher of Jyesthadeva (c. 1500-c. 1570) 
whose works survive and give us all the surviving proofs of this school. Damodara was 
also the teacher of Nilakantha (c. 1450-c. 1550) who composed the famous treatise 
called the Tantrasangraha (c. 1500), a digest of the mathematical and astronomical 
knowledge of his time. His works allow us to determine his approximate dates because, 
in his Aryabhatiyabhasya, Nilakantha refers to his observation of solar eclipses in 1467 
and 1501. Nilakantha made several efforts to establish new parameters for the mean 
motions of the planets and vigorously defended the necessity of continually correcting 
astronomical parameters on the basis of observation. Sankara Variyar (c. 1500-1560) 
was his student. 

The surviving texts containing results on infinite series are Nilakantha’s Tantrasangraha, 
a commentary on it by Sankara Variyar called Yuktidipika, the Yuktibhasa by 
Jyesthadeva, and the Kriyakramakari, started by Variyar and completed by his student 
Mahisamangalam Narayana. In addition, there is a text called Karanapaddhati of 
Putumana Somayaji, thought by some to have been written around 1700. However, 
the four translators of this work present an argument that Somayaji was a junior 
contemporary of Nilakantha and composed his work between 1532 and 1566. All these 
works are in Sanskrit except the Yuktibhasa, written in Malayalam, the language of 
Kerala. These works, especially the Yuktibhasa that gives detailed arguments, provide 
a summary of major results on series discovered by these original mathematicians of 
the indistinct past: 


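A. Series expansions for arctangent, sine, and cosine: 

(1) $\theta = \tan\theta - \dfrac{\tan^3\theta}{3} + \dfrac{\tan^5\theta}{5} - \cdots$, 

(2) $\sin\theta = \theta - \dfrac{\theta^3}{3!} + \dfrac{\theta^5}{5!} - \cdots$, 

(3) $\cos\theta = 1 - \dfrac{\theta^2}{2!} + \dfrac{\theta^4}{4!} - \cdots$, 

(4) $\sin^2\theta = \theta^2 - \dfrac{\theta^4}{2^2 - \frac{2}{2}} + \dfrac{\theta^6}{\left(2^2 - \frac{2}{2}\right)\left(3^2 - \frac{3}{2}\right)} - \dfrac{\theta^8}{\left(2^2 - \frac{2}{2}\right)\left(3^2 - \frac{3}{2}\right)\left(4^2 - \frac{4}{2}\right)} + \cdots$. 

In the proofs of the formulas contained in (A), the range of $\theta$ for the first 
series was $0 \le \theta \le \pi/4$ and for the second and third was $0 \le \theta \le \pi/2$. 
Although the series for sine and cosine converge for all real values, the concept 
of periodicity of the trigonometric functions was discovered much later. 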


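B. Series for $\pi$: 

(1) $\dfrac{\pi}{4} \approx 1 - \dfrac{1}{3} + \dfrac{1}{5} - \cdots \mp \dfrac{1}{n} \pm f_i(n+1)$, $i = 1, 2, 3$, where 
\[ f_1(n) = \frac{1}{2n}, \qquad f_2(n) = \frac{n}{2(n^2+1)}, \qquad \text{and} \qquad f_3(n) = \frac{n^2+4}{2n(n^2+5)}; \]

(2) $\dfrac{\pi}{4} = \dfrac{3}{4} + \dfrac{1}{3^3-3} - \dfrac{1}{5^3-5} + \dfrac{1}{7^3-7} - \cdots$, 

(3) $\dfrac{\pi}{4} = \dfrac{4}{1^5+4\cdot 1} - \dfrac{4}{3^5+4\cdot 3} + \dfrac{4}{5^5+4\cdot 5} - \cdots$, 

(4) $\dfrac{\pi}{2\sqrt{3}} = 1 - \dfrac{1}{3\cdot 3} + \dfrac{1}{5\cdot 3^2} - \dfrac{1}{7\cdot 3^3} + \cdots$, 

(5) $\dfrac{\pi}{6} = \dfrac{1}{2} + \dfrac{1}{(2\cdot 2^2-1)^2-2^2} + \dfrac{1}{(2\cdot 4^2-1)^2-4^2} + \dfrac{1}{(2\cdot 6^2-1)^2-6^2} + \cdots$, 

(6) $\dfrac{\pi-2}{4} \approx \dfrac{1}{2^2-1} - \dfrac{1}{4^2-1} + \dfrac{1}{6^2-1} - \cdots \mp \dfrac{1}{n^2-1} \pm \dfrac{1}{2\left((n+1)^2+2\right)}$, 

(7) $\dfrac{\pi}{8} = \dfrac{1}{2^2-1} + \dfrac{1}{6^2-1} + \dfrac{1}{10^2-1} + \cdots$, 

(8) $\dfrac{\pi}{8} = \dfrac{1}{2} - \dfrac{1}{4^2-1} - \dfrac{1}{8^2-1} - \dfrac{1}{12^2-1} - \cdots$. 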

These results were stated in verse form. Thus, the series for sine was described: 


The arc is to be repeatedly multiplied by the square of itself and is to be divided [in order] by the 
square of each even number increased by itself and multiplied by the square of the radius. The 
arc and the terms obtained by these repeated operations are to be placed in sequence in a column, 
and any last term is to be subtracted from the next above, the remainder from the term then next 
above, and so on, to obtain the jya (sine) of the arc. 


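So if $r$ is the radius and $s$ the arc, then the successive terms of the repeated 
operations mentioned in the description are given by 
\[ s\cdot\frac{s^2}{(2^2+2)r^2}, \qquad s\cdot\frac{s^2}{(2^2+2)r^2}\cdot\frac{s^2}{(4^2+4)r^2}, \qquad \ldots \]
and the equation is 
\[ y = s - s\cdot\frac{s^2}{(2^2+2)r^2} + s\cdot\frac{s^2}{(2^2+2)r^2}\cdot\frac{s^2}{(4^2+4)r^2} - \cdots, \]
where $y = r\sin\theta$. 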
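
The rule in the verse can be followed step by step in a short program. The Python sketch below is only an illustration under stated assumptions: the function name `jya_by_column` and the parameter `terms` are hypothetical, the radius 3438 is an example value chosen here, and floating-point arithmetic stands in for whatever arithmetic the Kerala astronomers actually used.

```python
import math

def jya_by_column(s, r, terms=8):
    """Approximate r*sin(s/r) by the rule in the verse (illustrative sketch)."""
    column = [s]  # the arc itself heads the column
    for k in range(1, terms):
        even = 2 * k  # the even numbers 2, 4, 6, ...
        # multiply by the square of the arc; divide by (even^2 + even) times r^2
        column.append(column[-1] * s * s / ((even * even + even) * r * r))
    result = 0.0
    for term in reversed(column):
        # subtract the running remainder from the term next above it
        result = term - result
    return result

r = 3438.0            # example radius (an assumption of this sketch)
s = r * math.pi / 6   # an arc of 30 degrees
print(jya_by_column(s, r), r * math.sin(s / r))
```

Working from the bottom of the column upward reproduces the alternating signs of the series without ever writing them down, which is how the verse manages to describe the sum in words alone.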

Nilakantha’s Aryabhatiyabhasya attributes the sine series to Madhava. The 
Kriyakramakari attributes to Madhava the first two cases of (B.1), the arctangent 
series, and series (B.4); note that (B.4) can be derived from the arctangent by taking 
$\theta = \pi/6$. The extant manuscripts do not appear to attribute the other series to a particular 
person. The Yuktidipika gives series (B.6), including the remainder; it is possible that 
this series is due to Sankara Variyar, the author of the work. Series (B.7) and (B.8) 
are mentioned in the Yuktibhasa and are easily transformable into series (A.1) with 
$\theta = \pi/4$. We can safely conclude that the power series for arctangent, sine, and cosine 
were obtained by Madhava. 

The series for $\sin^2\theta$, (A.4), follows directly from the series for $\cos\theta$ by an 
application of the double angle formula, $\sin^2\theta = \frac{1}{2}(1 - \cos 2\theta)$. The series for $\pi$, 
(B.1), has several points of interest. When $n \to \infty$, it is simply the series discovered 
by Leibniz in 1673, which he communicated to Newton.³ However, this series is not 
useful for computational purposes because it converges extremely slowly. To make 
it more effective in this respect, Madhava added a rational approximation for the 


2 Rajagopal and Rangachari (1977) p. 96. 
3 Newton (1959-1960) vol. 2, pp. 57-71, especially p. 67. 

remainder after $n$ terms. We present Jyesthadeva’s derivation for the expressions $f_1(n)$ 
and $f_2(n)$ in (B.1) later in this chapter. However, if we set 
\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \cdots \mp \frac{1}{n} \pm f(n+1), \tag{1.1} \]
then the remainder $f(n)$ has the continued fraction expansion 
\[ f(n) = \frac{1}{2}\cdot\cfrac{1}{n + \cfrac{1^2}{n + \cfrac{2^2}{n + \cfrac{3^2}{n + \cdots}}}}, \tag{1.2} \]
where $f(n+1)$ satisfies the functional relation 
\[ f(n+1) + f(n-1) = \frac{1}{n}. \tag{1.3} \]
The first three convergents of this continued fraction are 
\[ \frac{1}{2n} = f_1(n), \qquad \frac{n}{2(n^2+1)} = f_2(n), \qquad \text{and} \qquad \frac{1}{2}\cdot\frac{n^2+4}{n(n^2+5)} = f_3(n). \tag{1.4} \]


Although this continued fraction is not mentioned in any extant works of the Kerala 
school, their approximants indicate that they must have known it, at least implicitly. 
In fact, continued fractions appear in much earlier Indian works. The Lilavati of 
Bhaskara (c. 1150) used continued fractions to solve first-order Diophantine equations 
and Variyar’s Kriyakramakari was a commentary on Bhaskara’s book. 
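
The claim that the Kerala correction terms are convergents of a continued fraction can be checked with exact rational arithmetic. The Python sketch below assumes the continued fraction for the remainder has partial denominators $n$ and partial numerators $1^2, 2^2, 3^2, \ldots$, with an overall factor $\frac{1}{2}$; the function name `convergent` and its `depth` parameter are hypothetical names introduced for this illustration.

```python
from fractions import Fraction

def convergent(n, depth):
    """depth-th convergent of (1/2) * 1/(n + 1^2/(n + 2^2/(n + ...)))."""
    n = Fraction(n)
    tail = n  # innermost denominator
    # fold in the partial numerators (depth-1)^2, ..., 2^2, 1^2, working outward
    for j in range(depth - 1, 0, -1):
        tail = n + Fraction(j * j) / tail
    return Fraction(1, 2) / tail

n = 10
for depth in (1, 2, 3):
    print(depth, convergent(n, depth))

# closed forms of the three correction terms, for comparison
print(Fraction(1, 2 * n), Fraction(n, 2 * (n * n + 1)),
      Fraction(n * n + 4, 2 * n * (n * n + 5)))
```

Using `Fraction` rather than floating-point numbers makes the comparison with the closed forms $\frac{1}{2n}$, $\frac{n}{2(n^2+1)}$, and $\frac{n^2+4}{2n(n^2+5)}$ an exact equality instead of an approximate one.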

The approximation in equation (B.6) is similar to that in (B.1) and gives further 
evidence that the Kerala mathematicians saw a connection between series and 
continued fractions. If we write 


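\[ \frac{\pi-2}{4} = \frac{1}{2^2-1} - \frac{1}{4^2-1} + \frac{1}{6^2-1} - \cdots \mp \frac{1}{n^2-1} \pm g(n+1), \tag{1.5} \]
then 
\[ g(n) = \frac{1}{2n}\cdot\cfrac{1}{n + \cfrac{1\cdot 2}{n + \cfrac{2\cdot 3}{n + \cfrac{3\cdot 4}{n + \cdots}}}} \tag{1.6} \]
and 
\[ g_1(n) = \frac{1}{2n^2}, \qquad g_2(n) = \frac{1}{2(n^2+2)}. \tag{1.7} \]

Newton, who was very interested in the numerical aspects of series, also found the 
$f_1(n) = \frac{1}{2n}$ approximation when he saw Leibniz’s series. He wrote in a letter in 1676⁴ 
to Henry Oldenburg: 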


By the series of Leibniz also if half the term in the last place be added and some other like device 
be employed, the computation can be carried to many figures. 


4 Newton (1959-1960) vol. 2, pp. 110-149, especially p. 140. 




Though the accomplishments of Madhava and his followers are quite impressive, 
the members of the school do not appear to have had any interaction with people 
outside of the very small region where they lived and worked. By the end of the 
sixteenth century, the school ceased to produce any further original works. Thus, there 
appears to be no continuity between the ideas of the Kerala scholars and those outside 
India or even from other parts of India. 


1.2 Transformation of Series 


The series in equations (B.2) and (B.3) are transformations of 
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{2k-1} \]
by means of the rational approximations for the remainder. To understand this 
transformation in modern notation, observe: 
\[ \frac{\pi}{4} = \bigl(1 - f_1(2)\bigr) - \left(\frac{1}{3} - f_1(2) - f_1(4)\right) + \left(\frac{1}{5} - f_1(4) - f_1(6)\right) - \cdots. \tag{1.8} \]
The $(n+1)$th term in this series is 
\[ \frac{1}{2n+1} - f_1(2n) - f_1(2n+2) = \frac{1}{2n+1} - \frac{1}{4n} - \frac{1}{4(n+1)} = \frac{-1}{(2n+1)^3 - (2n+1)}. \tag{1.9} \]
Thus, we arrive at equation (B.2). Equation (B.3) is similarly obtained: 
\[ \frac{\pi}{4} = \bigl(1 - f_2(2)\bigr) - \left(\frac{1}{3} - f_2(2) - f_2(4)\right) + \left(\frac{1}{5} - f_2(4) - f_2(6)\right) - \cdots, \tag{1.10} \]
and here the $(n+1)$th term is 
\[ \frac{1}{2n+1} - \frac{n}{(2n)^2+1} - \frac{n+1}{(2n+2)^2+1} = \frac{4}{(2n+1)^5 + 4(2n+1)}. \tag{1.11} \]
Clearly, the $n$th partial sums of these two transformed series can be written as 
\[ s_i(n) = 1 - \frac{1}{3} + \frac{1}{5} - \cdots \pm \frac{1}{2n-1} \mp f_i(2n), \qquad i = 1, 2. \tag{1.12} \]
Since series (1.8) and (1.10) are alternating, and the absolute values of the terms 
are decreasing, it follows that 
\[ \frac{1}{(2n+1)^3-(2n+1)} - \frac{1}{(2n+3)^3-(2n+3)} < \left|\frac{\pi}{4} - s_1(n)\right| < \frac{1}{(2n+1)^3-(2n+1)}. \tag{1.13} \]
Also 
\[ \frac{4}{(2n+1)^5+4(2n+1)} - \frac{4}{(2n+3)^5+4(2n+3)} < \left|\frac{\pi}{4} - s_2(n)\right| < \frac{4}{(2n+1)^5+4(2n+1)}. \tag{1.14} \]

Thus, taking fifty terms of $1 - \frac{1}{3} + \frac{1}{5} - \cdots$ and using the approximation $f_2(n)$, 
the last inequality shows that the error in the value of $\pi/4$ becomes less than $4 \times 10^{-10}$. 
The Leibniz series with fifty terms is normally accurate in computing $\pi$ up to only 
one decimal place; by contrast, the Keralese method of rational approximation of the 
remainder produces numerically useful results. 
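
The gain from the correction terms is easy to observe numerically. The Python sketch below is a rough illustration under stated assumptions: it takes the three corrections to be $f_1(n) = \frac{1}{2n}$, $f_2(n) = \frac{n}{2(n^2+1)}$, and $f_3(n) = \frac{n^2+4}{2n(n^2+5)}$, uses floating-point arithmetic, and the function name `corrected_quarter_pi` is hypothetical.

```python
import math

def f1(n):
    return 1 / (2 * n)

def f2(n):
    return n / (2 * (n * n + 1))

def f3(n):
    return (n * n + 4) / (2 * n * (n * n + 5))

def corrected_quarter_pi(terms, correction=None):
    """Sum `terms` terms of 1 - 1/3 + 1/5 - ..., then add an optional end correction."""
    total = sum((-1) ** k / (2 * k + 1) for k in range(terms))
    if correction is not None:
        # the last denominator used is 2*terms - 1, so the correction is evaluated
        # at 2*terms and carries the sign of the first omitted term
        total += (-1) ** terms * correction(2 * terms)
    return total

for name, fn in (("none", None), ("f1", f1), ("f2", f2), ("f3", f3)):
    error = abs(math.pi / 4 - corrected_quarter_pi(50, fn))
    print(name, error)
```

With fifty terms the uncorrected sum is off in the third decimal place, while each successive correction shrinks the error by several further orders of magnitude.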


1.3 Jyesthadeva on Sums of Powers 


The Sanskrit texts of the Kerala school with few exceptions contain merely the 
statements of results without derivations. It is therefore extremely fortunate that 
Jyesthadeva’s Malayalam text Yuktibhasa, containing the methods for obtaining the 
formulas, has survived. Sankara Variyar’s Yuktidipika is a modified Sanskrit version 
of the Yuktibhasa. It seems that the Yuktibhasa was the text used by Jyesthadeva’s 
students at his illam. From this, one may surmise that Variyar, a student of Nilakantha, 
also studied with Jyesthadeva whose illam was very close to that of Nilakantha. 
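A basic result used by the Kerala school in the derivation of their series is that 
\[ \lim_{n\to\infty} \frac{1}{n^{k+1}} \sum_{j=1}^{n} j^k = \frac{1}{k+1}. \tag{1.15} \]
Jyesthadeva gave an inductive proof of this result.⁵ He noted 
\[ S_n^{(1)} = n + (n-1) + \cdots + 1 = \frac{n(n+1)}{2}. \]
He then observed that for large $n$, $\frac{n(n+1)}{2} \approx \frac{n^2}{2}$ and hence 
\[ S_n^{(1)} \approx \frac{1}{2} n^2. \tag{1.16} \]
Now 
\[ S_n^{(2)} = n^2 + (n-1)^2 + \cdots + 1^2 \]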


5 Jyesthadeva et al. (2008) pp. 192-196. 


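and 
\[ nS_n^{(1)} = n\bigl(n + (n-1) + \cdots + 1\bigr), \]
so that 
\[ \begin{aligned}
nS_n^{(1)} - S_n^{(2)} &= 1\cdot(n-1) + 2(n-2) + 3(n-3) + \cdots + (n-1)\cdot 1 \\
&= (n-1) + (n-2) + (n-3) + \cdots + 1 \\
&\quad + (n-2) + (n-3) + \cdots + 1 \\
&\quad + (n-3) + \cdots + 1 \\
&\quad + \cdots \\
&= S_{n-1}^{(1)} + S_{n-2}^{(1)} + S_{n-3}^{(1)} + \cdots + S_1^{(1)}.
\end{aligned} \tag{1.17} \]
By using (1.16) in (1.17), Jyesthadeva had, for large $n$, 
\[ nS_n^{(1)} - S_n^{(2)} \approx \frac{(n-1)^2}{2} + \frac{(n-2)^2}{2} + \frac{(n-3)^2}{2} + \cdots + \frac{1^2}{2} \approx \frac{1}{2} S_n^{(2)}. \tag{1.18} \]
Note that for large $n$, $S_{n-1}^{(2)} \approx S_n^{(2)}$ was used to obtain (1.18). Again, by applying 
(1.16) in (1.18), he had 
\[ S_n^{(2)} \approx \frac{1}{3} n^3. \tag{1.19} \]
Next 
\[ nS_n^{(2)} - S_n^{(3)} = 1\cdot(n-1)^2 + 2\cdot(n-2)^2 + \cdots + (n-1)\cdot 1^2. \]
Applying equation (1.19) yielded 
\[ nS_n^{(2)} - S_n^{(3)} \approx \frac{1}{3} S_n^{(3)}, \]
or 
\[ S_n^{(3)} \approx \frac{1}{4} n^4. \]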

Jyesthadeva next presented the general principle of summation, that we may 
express within our notation as: 


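Let 
\[ S_n^{(k)} = n^k + (n-1)^k + \cdots + 1^k \]
and suppose that $S_n^{(k-1)}$ has been estimated to be 
\[ S_n^{(k-1)} \approx \frac{1}{k} n^k. \]
Then 
\[ nS_n^{(k-1)} - S_n^{(k)} = S_{n-1}^{(k-1)} + S_{n-2}^{(k-1)} + \cdots + S_1^{(k-1)}. \tag{1.20} \]
For large $n$ the right-hand side is approximately $\frac{1}{k} S_n^{(k)}$, so that 
$\left(1 + \frac{1}{k}\right) S_n^{(k)} \approx \frac{1}{k} n^{k+1}$; hence 
\[ S_n^{(k)} \approx \frac{1}{k+1} n^{k+1} \tag{1.21} \]
and this proved (1.15). 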

As noted, Jyesthadeva did not write formulas in the symbolic form we have used. 
Rather, he gave verbal descriptions of his relations and formulas. His application of 
induction is very clearly executed. He writes that the case k = 1 implies the case 
k = 2, that in turn implies the case k = 3; in the same manner, a higher value of 
k will imply the next value, and so on.⁶ Also note that formula (1.20) was known to 
al-Haytham (965-1039) for k = 1 to k = 4; he most probably knew that it could 
be generalized, though he did not do so explicitly.⁷ 
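
Both ingredients of the inductive argument can be tested on small cases. The Python sketch below checks, with exact integers, the identity that the difference $nS_n^{(k-1)} - S_n^{(k)}$ equals $S_{n-1}^{(k-1)} + \cdots + S_1^{(k-1)}$, and then prints the ratio $S_n^{(k)}/n^{k+1}$ for growing $n$. The function names `power_sum`, `identity_holds`, and `ratio` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def power_sum(n, k):
    """S_n^(k) = 1^k + 2^k + ... + n^k, computed exactly."""
    return sum(j ** k for j in range(1, n + 1))

def identity_holds(n, k):
    """Check n*S_n^(k-1) - S_n^(k) == S_(n-1)^(k-1) + ... + S_1^(k-1)."""
    left = n * power_sum(n, k - 1) - power_sum(n, k)
    right = sum(power_sum(m, k - 1) for m in range(1, n))
    return left == right

def ratio(n, k):
    """S_n^(k) / n^(k+1) as an exact fraction."""
    return Fraction(power_sum(n, k), n ** (k + 1))

print(all(identity_holds(n, k) for n in range(1, 21) for k in range(1, 6)))
for n in (10, 100, 1000):
    print(n, float(ratio(n, 3)))   # compare with 1/(3+1)
```

The identity is exact for every $n$, whereas the limit $\frac{1}{k+1}$ is approached only slowly, with an error of order $1/n$.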


1.4 Arctangent Series in the Yuktibhasa 


The derivation of the arctangent series,⁸ as given by Jyesthadeva, boils down to the 
integration of $\frac{1}{1+x^2}$, as do the methods of Gregory and Leibniz. 

In Figure 1.1, $AC$ is a quarter circle of radius one with center $O$; $OABC$ is a square. 
The side $AB$ is divided into $n$ equal parts of length $\delta$ so that $n\delta = 1$ and $P_{k-1}P_k = \delta$. 
$EF$ and $P_{k-1}D$ are perpendicular to $OP_k$. Now, the triangles $OEF$ and $OP_{k-1}D$ are 
similar, implying that 
\[ \frac{EF}{OE} = \frac{P_{k-1}D}{OP_{k-1}}, \qquad \text{or} \qquad EF = \frac{P_{k-1}D}{OP_{k-1}}. \]
The similarity of the triangles $P_{k-1}P_kD$ and $OAP_k$ gives 
\[ \frac{P_{k-1}P_k}{OP_k} = \frac{P_{k-1}D}{OA}, \qquad \text{or} \qquad P_{k-1}D = \frac{P_{k-1}P_k}{OP_k}. \]


6 ibid. pp. 65-66. 
7 See Katz (1995) p. 125. 
8 Jyesthadeva et al. (2008) pp. 183-191. 

Figure 1.1 Rectifying a circle by the arctangent series. 


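Thus, 
\[ EF = \frac{P_{k-1}P_k}{OP_{k-1}\cdot OP_k} \approx \frac{P_{k-1}P_k}{OP_k^2} = \frac{P_{k-1}P_k}{1 + AP_k^2} = \frac{\delta}{1 + k^2\delta^2}. \]
Now 
\[ \text{arc } EG \approx EF \approx \frac{\delta}{1 + k^2\delta^2}, \]
and if we write $AP_k = x = \tan\theta$, where $\theta = \angle AOP_k$, then 
\[ \arctan x = \lim_{n\to\infty} \sum_{j=1}^{k} \frac{\delta}{1 + j^2\delta^2}. \tag{1.22} \]
To compute this limit, Jyesthadeva expanded $\frac{1}{1 + j^2\delta^2}$ as a geometric series. He 
derived the series by an iterative procedure: 
\[ \frac{1}{1+x} = 1 - x\left(\frac{1}{1+x}\right) = 1 - x\left(1 - x\left(\frac{1}{1+x}\right)\right). \]
Thus, (1.22) is converted to 
\[ \begin{aligned}
\arctan x &= \lim_{n\to\infty} \left( \delta\sum_{j=1}^{k} 1 - \delta^3\sum_{j=1}^{k} j^2 + \delta^5\sum_{j=1}^{k} j^4 - \cdots \right) \\
&= \lim_{n\to\infty} \left( x - \frac{x^3}{k^3}\sum_{j=1}^{k} j^2 + \frac{x^5}{k^5}\sum_{j=1}^{k} j^4 - \cdots \right) \\
&= x - \frac{x^3}{3} + \frac{x^5}{5} - \cdots.
\end{aligned} \]
The last step follows from (1.15). Note that this is the Madhava–Gregory series for 
$\arctan x$ and the series for $\pi$ follows by taking $x = 1$. 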
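
The limit that underlies this derivation, in which the arc is approximated by a sum of the pieces $\delta/(1 + j^2\delta^2)$, can be tried directly. The Python sketch below is a numerical illustration only: the function name `arctan_by_division` is hypothetical, and it assumes that $x$ is a multiple of the step $\delta = 1/n$ (or close enough that rounding the number of steps does no harm).

```python
import math

def arctan_by_division(x, n):
    """Sum delta/(1 + j^2 delta^2) for j = 1..k, where delta = 1/n and k*delta = x."""
    delta = 1.0 / n
    k = round(x * n)  # number of steps along AB needed to reach AP_k = x
    return sum(delta / (1 + (j * delta) ** 2) for j in range(1, k + 1))

for n in (10, 100, 1000, 10000):
    print(n, arctan_by_division(1.0, n), math.pi / 4)

# the first three terms of the resulting series, for a smaller x
x = 0.5
print(arctan_by_division(x, 10000), x - x**3 / 3 + x**5 / 5, math.atan(x))
```

The sum converges only at the rate $1/n$, which is one reason the passage to the power series, and then to the corrected series for $\pi$, was worth the effort.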


1.5 Derivation of the Sine Series in the Yuktibhasa 


Once again, Jyesthadeva’s derivation of the sine series has similarities with Leibniz’s 
derivation of the cosine series. In Figure 1.2, suppose that $\angle AOP = \theta$, $OP = R$, $P$ is 
the midpoint of the arc $P_{-1}P_1$, and $PQ$ is perpendicular to $OA$, where $O$ is the origin 
of the coordinate system. Let $P = (x, y)$, $P_1 = (x_1, y_1)$, and $P_{-1} = (x_{-1}, y_{-1})$. From 
the similarity of the triangles $P_{-1}Q_1P_1$ and $OPQ$, we have 
\[ \frac{P_{-1}P_1}{OP} = \frac{x_{-1} - x_1}{y} = \frac{y_1 - y_{-1}}{x}. \tag{1.23} \]
Jyesthadeva took an arc, $P_{-1}P_1 = R\,\Delta\theta = \Delta s$, small enough that he could set it 
equal to the line segment $P_{-1}P_1$; we can then write (1.23) as 
\[ \cos\left(\theta + \frac{\Delta\theta}{2}\right) - \cos\left(\theta - \frac{\Delta\theta}{2}\right) = -\sin\theta\,\Delta\theta \tag{1.24} \]
and 
\[ \sin\left(\theta + \frac{\Delta\theta}{2}\right) - \sin\left(\theta - \frac{\Delta\theta}{2}\right) = \cos\theta\,\Delta\theta. \tag{1.25} \]

In fact, in his Siddhanta Siromani, Bhaskara⁹ had stated (1.25) and proved it 
in the same way; he applied it to the discussion of the instantaneous motion of 
planets. Interestingly, in the 1650s, Pascal¹⁰ used a very similar argument to show 
that $\int \cos\theta\,d\theta = \sin\theta$ and $\int \sin\theta\,d\theta = -\cos\theta$. 


Figure 1.2 Derivation of the sine series. 


9 Bhaskara (2010). 
10 Struik (1969) vol. 2, p. 239. 

From (1.24) and (1.25) Jyesthadeva derived the result, given in modern notation: 
\[ \sin\theta - \theta = -\int_0^{\theta}\!\!\int_0^{t} \sin u \, du \, dt = -\int_0^{\theta} (1 - \cos t)\, dt. \tag{1.26} \]
0 JO 0 


We also note that Leibniz found the series for cosine using a similar method of 
repeated integration.¹¹ In Jyesthadeva, the integrals are replaced by sums and double 
integrals by sums of sums. The series is then obtained by using successive polynomial 
approximations for $\sin\theta$. For example, when the first approximation $\sin u \approx u$ is used 
in the right-hand side of (1.26), the result is 
\[ \sin\theta - \theta \approx -\frac{\theta^3}{3!}, \qquad \text{or} \qquad \sin\theta \approx \theta - \frac{\theta^3}{3!}. \]
When this approximation is employed in (1.26), we obtain 
\[ \sin\theta - \theta \approx -\frac{\theta^3}{3!} + \frac{\theta^5}{5!}. \]
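Briefly, Jyesthadeva arrived at the sums approximating (1.26) by first dividing $AP$ 
into $n$ equal parts using division points $P_1, P_2, \ldots, P_{n-1}$. Denote the midpoint of the 
arc $P_{k-1}P_k$ as $P_{k-\frac{1}{2}}$. Then by (1.23) and using $\Delta s = R\,\Delta\theta$, 
\[ x_{k+\frac{1}{2}} - x_{k-\frac{1}{2}} = -\frac{\Delta s}{R}\, y_k, \qquad k = 1, 2, \ldots, n-1. \tag{1.27} \]
We also have 
\[ (y_{k+1} - y_k) - (y_k - y_{k-1}) = \frac{\Delta s}{R}\left(x_{k+\frac{1}{2}} - x_{k-\frac{1}{2}}\right), \tag{1.28} \]
or 
\[ y_{k+1} - 2y_k + y_{k-1} = -\left(\frac{\Delta s}{R}\right)^2 y_k, \qquad k = 1, 2, \ldots, n-1. \tag{1.29} \]
Now in (1.29), start with $k = n-1$ and multiply the equations by $1, 2, \ldots, n-1$, 
respectively, and sum the resulting equations. We then have 
\[ \begin{aligned}
y_n - ny_1 &= -\left(\frac{\Delta s}{R}\right)^2 \bigl(y_{n-1} + 2y_{n-2} + \cdots + (n-1)y_1\bigr) \\
&= -\left(\frac{\Delta s}{R}\right)^2 \bigl(y_1 + (y_1 + y_2) + \cdots + (y_1 + y_2 + \cdots + y_{n-1})\bigr),
\end{aligned} \tag{1.30} \]
the result corresponding to (1.26). To obtain the successive polynomial approximations, 
Jyesthadeva had to work with sums of powers of integers; in order to deal with 
these sums, he applied the same lemma (1.15) he had used for the arctangent series. 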
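
The second-difference relation (1.29) and the summed form (1.30) can be explored numerically. The Python sketch below generates the values $y_k$ from the recurrence and compares both sides of (1.30); it is an illustration under stated assumptions, not a reconstruction of Jyesthadeva's computation. In particular, the starting value $y_1 = R\sin(\Delta\theta)$ is supplied exactly as a convenience, and the function names `sine_chords` and `both_sides` are hypothetical.

```python
import math
from itertools import accumulate

def sine_chords(theta, n, R=1.0):
    """Generate y_0, ..., y_n from y_(k+1) - 2 y_k + y_(k-1) = -(ds/R)^2 y_k."""
    d = theta / n                    # ds / R, the angular step
    y = [0.0, R * math.sin(d)]       # y_0 = 0; y_1 seeded exactly (an assumption)
    for k in range(1, n):
        y.append(2 * y[k] - y[k - 1] - d * d * y[k])
    return y

def both_sides(y, d):
    """Return the left side of (1.30) and its two right-hand forms."""
    n = len(y) - 1
    left = y[n] - n * y[1]
    weighted = -d * d * sum(m * y[n - m] for m in range(1, n))
    nested = -d * d * sum(accumulate(y[1:n]))   # y1 + (y1+y2) + ... as a sum of sums
    return left, weighted, nested

theta, n = 1.0, 1000
y = sine_chords(theta, n)
print(y[n], math.sin(theta))
print(both_sides(y, theta / n))
```

The two right-hand forms agree because counting each $y_j$ once for every partial sum that contains it gives exactly the weights $n-1, n-2, \ldots, 1$.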


11 Newton (1959-1960) vol. 2, p. 74. 

Observe that (1.30) involves a sum of sums; in fact, Jyesthadeva’s work has a 
section devoted to the topic of repeated summation.¹² Since this is a topic we shall 
see repeatedly in this book, we explain how Jyesthadeva dealt with it. Denote the sum 
of the first $n$ natural numbers by 
\[ V_n^{(1)} = n + (n-1) + \cdots + 2 + 1, \]
a sum whose value was given by Jyesthadeva as 
\[ V_n^{(1)} = \frac{n(n+1)}{2}. \]


He remarked that this result was easier to understand by making a diagram with 
a shaded square in the first row, two shaded squares on the second row, three in the 
third row, and so on, with the last row containing as many shaded squares as terms. A 
diagram with n = 4: 

[Diagram: four rows containing one, two, three, and four shaded squares.] 

With $n$ rows of squares, one may see that the result is $\frac{1}{2}n(n+1)$. 
Jyesthadeva next noted, with a brief geometric argument, that the second summation 
was 
\[ \begin{aligned}
V_n^{(2)} &= V_n^{(1)} + V_{n-1}^{(1)} + \cdots + V_1^{(1)} \\
&= \frac{n(n+1)}{2} + \frac{(n-1)n}{2} + \cdots + \frac{1\cdot 2}{2} \\
&= \frac{n(n+1)(n+2)}{1\cdot 2\cdot 3},
\end{aligned} \tag{1.31} \]
writing that the successive summations continued in the same manner. Thus, in 
modern notation we have 
\[ V_n^{(k)} = V_n^{(k-1)} + V_{n-1}^{(k-1)} + \cdots + V_1^{(k-1)} = \frac{n(n+1)(n+2)\cdots(n+k)}{1\cdot 2\cdot 3\cdots(k+1)}. \tag{1.32} \]

We note that (1.32) is an important formula; it occurs in the earlier Ganita 
Kaumudi of Narayana Pandita (c. 1350). It is also given in Zhu Shijie’s 1303 Siyuan 
Yujian.¹³ This formula was rediscovered by several European mathematicians in the 

12 Jyesthadeva et al. (2009) pp. 226-228. 
13, Hoe (2007) p. 400. 
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seventeenth century. We note a method of proof given by Nicole!* and others. Taking 
(1.32) as the definition of v6, we have 


JG+tD::--G+tk-DG+A) G-Dj---Gtk-D) 
1-2---k-(kK+1) eee ea ene 
nah LOO aly oh. 
iG+)---G+k- 


= oGaE =e j=1,2,---,n. (1.33) 


) _ y® _ 
Ve ave = 


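By a repeated application of (1.33), one obtains 

\[ V_n^{(k-1)} + V_{n-1}^{(k-1)} + \cdots + V_1^{(k-1)} = \left(V_n^{(k)} - V_{n-1}^{(k)}\right) + \left(V_{n-1}^{(k)} - V_{n-2}^{(k)}\right) + \cdots + \left(V_2^{(k)} - V_1^{(k)}\right) + V_1^{(k-1)} = V_n^{(k)} - V_1^{(k)} + V_1^{(k-1)}, \tag{1.34} \]

and since $V_1^{(k)} = V_1^{(k-1)} = 1$, the result is proved. Jyesthadeva observed that for 
large n, 

\[ V_n^{(k)} \approx \frac{n^{k+1}}{1\cdot 2\cdots(k+1)}. \tag{1.35} \]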


We now pick up the thread of Jyesthadeva’s derivation of the sine and cosine series; 
in the course of this derivation, he employed the versine function vers θ, defined by 
vers θ = 1 − cos θ. See equation (1.26). Thus, referring to Figure 1.2, 

\[ AQ = z = R \operatorname{vers}\theta = R - R\cos\theta. \]

With x denoting the R cos and y the R sin values in (1.27) through (1.30), we can 
see that $z_k = R - x_k$; Jyesthadeva could take (1.27) as 


Zeb f= Be KHL. nL (1.36) 


Adding the n — 1 equations produced 


A 


S 
Lop (Yn—1 + yn-2 +--+ + y1). 


z z 


1 
ane} 


Taking n to be very large, he could replace $z_{n-\frac{1}{2}}$ by $z_n$ and $z_{\frac{1}{2}}$ by zero to obtain 


-1 
As Ss L 
in = On-1t nat FID = Te D4 (1.37) 


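14 Nicole (1717). 

Now for very large n 

\[ y_k \approx \frac{ks}{n}, \tag{1.38} \]

so that 

\[ z_n \approx \frac{s}{nR}\left((n-1)\frac{s}{n} + (n-2)\frac{s}{n} + \cdots + \frac{s}{n}\right), \]

but 

\[ \frac{s}{nR}\left((n-1)\frac{s}{n} + (n-2)\frac{s}{n} + \cdots + \frac{s}{n}\right) = \frac{1}{R}\left(\frac{s}{n}\right)^2\left((n-1) + (n-2) + \cdots + 1\right), \]

so, taking k = 1 in (1.35), 

\[ z_n \approx \frac{1}{R}\,\frac{s^2}{1\cdot 2}. \tag{1.39} \]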


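Next, by (1.38) and (1.39), and taking k = 2 in (1.35), for (1.30) Jyesthadeva had 

\[ y_n \approx s - \frac{1}{R^2}\left(\frac{s}{n}\right)^3\bigl(1 + (1+2) + (1+2+3) + \cdots + (1+2+\cdots+(n-1))\bigr) \approx s - \frac{1}{R^2}\,\frac{s^3}{1\cdot 2\cdot 3}; \tag{1.40} \]

here note that 

\[ y_n = R\sin\theta. \tag{1.41} \]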


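Equation (1.40) gave Jyesthadeva a value of $y_k$ better than that in (1.38); he 
employed this to obtain an improved approximation for $z_n$. Thus, he substituted 

\[ y_k \approx \frac{ks}{n} - \frac{1}{R^2}\,\frac{1}{1\cdot 2\cdot 3}\left(\frac{ks}{n}\right)^3 \]

in (1.37) to find that 

\[ z_n \approx \frac{s}{nR}\cdot\frac{s}{n}\bigl((n-1) + (n-2) + \cdots + 1\bigr) - \frac{s}{nR}\cdot\frac{1}{R^2}\left(\frac{s}{n}\right)^3\frac{1}{1\cdot 2\cdot 3}\bigl((n-1)^3 + (n-2)^3 + \cdots + 1^3\bigr) \approx \frac{1}{R}\,\frac{s^2}{1\cdot 2} - \frac{1}{R^3}\,\frac{s^4}{1\cdot 2\cdot 3\cdot 4}. \tag{1.42} \]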
Finally, employing (1.40) in (1.30) yielded Jyesthadeva the result 

\[ y_n = R\sin\theta \approx s - \frac{1}{R^2}\,\frac{s^3}{1\cdot 2\cdot 3} + \frac{1}{R^4}\,\frac{s^5}{1\cdot 2\cdot 3\cdot 4\cdot 5}. \tag{1.43} \]

Repeating this process infinitely often would give the result 

\[ \sin\theta = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots, \tag{1.44} \]

completing Jyesthadeva’s derivation for the series for sin θ; clearly, the series for cos θ 
was an immediate consequence. 

Jyesthadeva also noted that by applying (1.35) to (1.34), for large n one would 
obtain 

\[ \frac{n^k}{k!} + \frac{(n-1)^k}{k!} + \cdots + \frac{1^k}{k!} \approx \frac{n^{k+1}}{(k+1)!}, \]

or 

\[ \sum_{j=1}^{n} j^k \approx \frac{n^{k+1}}{k+1}. \tag{1.45} \]
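The quality of the approximation (1.45) for growing n can be illustrated with exact rational arithmetic. This is only a numerical sketch with hypothetical function names; it shows the ratio of the two sides approaching 1 and proves nothing.

```python
from fractions import Fraction

def power_sum(n, k):
    """1^k + 2^k + ... + n^k, computed directly."""
    return sum(j ** k for j in range(1, n + 1))

def ratio(n, k):
    """(k+1) * (1^k + ... + n^k) / n^(k+1); (1.45) says this is close to 1."""
    return Fraction((k + 1) * power_sum(n, k), n ** (k + 1))

for k in (1, 2, 3):
    # The printed ratios move toward 1 as n increases.
    print(k, [float(ratio(n, k)) for n in (10, 100, 1000)])
```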


He thus gave a second method of proving (1.21) and thus (1.15),¹⁵ but by using 
$V_n^{(k)}$, now called pyramidal or tetrahedral numbers. As we shall see in Chapter 2, this 
method of proof of (1.45) was rediscovered by Fermat. 

The Kerala mathematicians, and indeed Fermat as well, must demand our admi- 
ration for their ability to perform such elaborate and intricate analytic calculations 
with only a very rudimentary notation at their disposal. We would do well to keep in 
mind the advantage afforded us by the mathematical power of our modern symbolic 
notation. 


1.6 Continued Fractions 


The noted twelfth-century Indian mathematician Bhaskara, who lived and worked in 
the area now known as Karnataka, used continued fractions in his c. 1150 Lilavati. The 
Kerala school was certainly familiar with Bhaskara’s work, since they commented on 
it. It is therefore possible that they were aware of the specific continued fractions 
(1.2) and (1.6) for the error terms, even though they mentioned only the first few 
convergents of these fractions. 

The Yuktibhasa indicated a method¹⁶ by which the first two approximations for 
f(n + 1) in (1.4) could be derived. Jyesthadeva observed that if the correction in (1.1) 
was performed after the term $\frac{1}{n-2}$, then 

\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \cdots \pm \frac{1}{n-2} \mp f(n-1). \tag{1.46} \]


15 Jyesthadeva (2008) pp. 98-99, 227. 
16 ibid. pp. 201-205. 

Subtracting (1.46) from (1.1) would yield (1.3). Now Jyesthadeva noted that 
$f(n+1) = \frac{1}{2n}$ was merely an approximation, because it implied that $f(n-1) = \frac{1}{2n-4}$, 
so that equation (1.3) would not be satisfied. But he next argued that it would be 
possible to bring the values of both f(n + 1) and f(n − 1) close to $\frac{1}{2n}$ by taking 

\[ f(n-1) = \frac{1}{2n-2} \quad\text{and}\quad f(n+1) = \frac{1}{2n+2}. \]

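Subtracting series (1.46) from (1.1) gave a measure of the error, er(n), involved in 
choosing these values: 

\[ \operatorname{er}(n) = f(n+1) + f(n-1) - \frac{1}{n} = \frac{1}{2n+2} + \frac{1}{2n-2} - \frac{1}{n} = \frac{2n^2-2n}{4n^3-4n} + \frac{2n^2+2n}{4n^3-4n} - \frac{4n^2-4}{4n^3-4n} = \frac{4}{4n^3-4n} = \frac{1}{n^3-n}, \tag{1.47} \]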


that is essentially the same as the result obtained in (1.9). Jyesthadeva’s next step 
was to point out that the error given by (1.47) was positive, but adding 1 to 
the denominator of f(n + 1), so that $f(n+1) \approx \frac{1}{2n+3}$, would make the error 
negative: 

\[ \operatorname{er}(n) = \frac{1}{2n+3} + \frac{1}{2n-1} - \frac{1}{n} = \frac{2n^2-n}{4n^3+4n^2-3n} + \frac{2n^2+3n}{4n^3+4n^2-3n} - \frac{4n^2+4n-3}{4n^3+4n^2-3n} = \frac{-2n+3}{4n^3+4n^2-3n}. \tag{1.48} \]


However, the error in (1.48) was not an improvement over (1.47) because of the 
term −2n in the numerator. In order to improve upon (1.48), Jyesthadeva observed that 
a quantity less than 1 must be added to the denominators of f(n + 1) and f(n − 1); he 
remarked that adding 1 to the denominators of f(n + 1) and f(n − 1) had introduced 
an extra factor of 2 in the numerator. Thus, if the denominator of f(n + 1) in (1.47) 
were changed to 2n + 2 + Et then the contribution 2 of the error terms would 


become aps ~ 1. But the term —i added —4n — | in the numerator. By changing the 
denominator of f(n + 1) to2n+2+ 5 a. the contribution of -i would essentially 


amount to —1, and would cancel with +1 from the error terms f(m + 1) + f(m — 1). 
Thus, reasoned Jyesthadeva, he should take 


\[ f(n+1) = \frac{1}{2n+2+\frac{4}{2n+2}}. \]

The corresponding er(n) would then be given as 

\[ \operatorname{er}(n) = \frac{2(n+1)}{4(n+1)^2+4} + \frac{2(n-1)}{4(n-1)^2+4} - \frac{1}{n} = \frac{n+1}{2(n^2+2+2n)} + \frac{n-1}{2(n^2+2-2n)} - \frac{1}{n} = \frac{(n+1)(n^2+2-2n) + (n-1)(n^2+2+2n)}{2(n^4+4)} - \frac{1}{n} = \frac{-4}{n(n^4+4)}. \tag{1.49} \]

Note that (1.49) is the same as (1.11) and leads to the series (B.3).¹⁷ 
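The three error expressions (1.47), (1.48), and (1.49) can be confirmed with exact fractions. The sketch below uses hypothetical names; each candidate is written as a function of its argument m, so that, for example, f(n + 1) = 1/(2n + 2) corresponds to f(m) = 1/(2m).

```python
from fractions import Fraction

def f_first(m):
    """f(m) = 1/(2m), i.e. f(n+1) = 1/(2n+2) and f(n-1) = 1/(2n-2)."""
    return Fraction(1, 2 * m)

def f_plus_one(m):
    """Denominator increased by 1: f(n+1) = 1/(2n+3), f(n-1) = 1/(2n-1)."""
    return Fraction(1, 2 * m + 1)

def f_second(m):
    """f(m) = 1/(2m + 4/(2m)), which simplifies to m / (2(m^2 + 1))."""
    return Fraction(m, 2 * (m * m + 1))

def er(f, n):
    """Error f(n+1) + f(n-1) - 1/n in the functional relation."""
    return f(n + 1) + f(n - 1) - Fraction(1, n)

for n in range(2, 8):
    print(n, er(f_first, n), er(f_plus_one, n), er(f_second, n))
```

For n = 5 the three errors are 1/120, −7/585, and −4/3145, so the third choice is much the smallest in absolute value, in line with the text.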
Jyesthadeva also gave a derivation of (B.6) by taking f(n + 1) = +4, so that 


1 ae 
f(a —1) = x= and 


fotieyesS Soe 
eo ‘ n 2 2n—-4 on 
_ 1 
hee = 1: 
He next derived a third-order correction: 
fant) 
no = 4 
AT EE ores 
n+l 4 
(4) ef 


(n+ 1)? +5) 


There is a method for finding a continued fraction for f(n) that goes back to 
Wallis;¹⁸ Whiteside’s reworked form of this method is quite clear:¹⁹ Start with the 
functional equation (1.4) for f(n), 

\[ f(n+1) + f(n-1) = \frac{1}{n}. \tag{1.50} \]

It is obvious that a first approximation for f(n) is given by $f(n) \approx \frac{1}{2n}$. As a first 
step toward the continued fraction for f(n), set 

\[ f(n) = \frac{1}{2r_n^{(0)}}, \qquad r_n^{(0)} = n + \frac{1}{r_n^{(1)}}. \tag{1.51} \]

17 ibid. p. 205. 
18 Wallis (2004) pp. 167-174. 
19 Whiteside (1961b) p. 212. 

It follows from (1.50) that $r_n^{(0)}$ satisfies 

\[ \left(2r_{n+1}^{(0)} - n\right)\left(2r_{n-1}^{(0)} - n\right) = n^2. \tag{1.52} \]

From (1.51) 

\[ 2r_{n+1}^{(0)} - n = n + 2 + \frac{2}{r_{n+1}^{(1)}}, \]

and a similar relation holds for $r_{n-1}^{(0)}$. When these values are substituted in (1.52), some 
calculation gives us 

\[ \left(2r_{n+1}^{(1)} - (n-2)\right)\left(2r_{n-1}^{(1)} - (n+2)\right) = n^2. \tag{1.53} \]

Once again, $r_n^{(1)} \approx n$. So assume $r_n^{(1)} = n + \frac{1}{s_n}$ and substitute in (1.53) to get, after 
simplification, 

\[ 16s_{n+1}s_{n-1} - 2(n+4)s_{n+1} - 2(n-4)s_{n-1} - 4 = 0. \tag{1.54} \]

To obtain an equation such as (1.52) or (1.53), multiply (1.54) by 4, set 

\[ s_n = \frac{r_n^{(2)}}{4}, \]

and add n² to both sides to get 

\[ \left(2r_{n+1}^{(2)} - (n-4)\right)\left(2r_{n-1}^{(2)} - (n+4)\right) = n^2. \tag{1.55} \]

We then have $r_n^{(1)} = n + \frac{2^2}{r_n^{(2)}}$. Similarly, $r_n^{(3)}$, defined by 

\[ r_n^{(2)} = n + \frac{3^2}{r_n^{(3)}}, \]

satisfies the equation 

\[ \left(2r_{n+1}^{(3)} - (n-6)\right)\left(2r_{n-1}^{(3)} - (n+6)\right) = n^2. \tag{1.56} \]

It can be shown inductively that if 

\[ r_n^{(k-1)} = n + \frac{k^2}{r_n^{(k)}} \]

and 

\[ \left(2r_{n+1}^{(k-1)} - (n-2(k-1))\right)\left(2r_{n-1}^{(k-1)} - (n+2(k-1))\right) = n^2, \]

then 

\[ \left(2r_{n+1}^{(k)} - (n-2k)\right)\left(2r_{n-1}^{(k)} - (n+2k)\right) = n^2. \]


It follows that f(n) has the continued fraction expansion (1.2). In a similar way, we 
may obtain the continued fraction (1.6) for g() if we start with the functional relation 


1 
n2—1- 


It may be instructive to consider another method, first published by Gauss,²⁰ for 
finding the continued fractions of the Kerala school, also a method for obtaining the 
successive convergents. There is certainly no clear indication that this method was 
discovered before Gauss did so in his work on approximate quadrature, presented in 
1814, published in 1815. Start with a series of the form 

\[ f(n) = \frac{a_1}{n} + \frac{a_2}{n^2} + \frac{a_3}{n^3} + \cdots. \tag{1.57} \]


Note that it is always possible to associate a continued fraction with (1.57) by 
applying successive division. Write (1.57) as 


1 
fa)=z 


eas ot 2k, (1.58) 
2 pee) 


a 


From this we can see that 


b} by. b 
na)= +34 34..., 
n n n 


a series of the same kind as (1.57). So the process can be continued, and the result is a 
continued fraction for f(n). To find the numbers $a_1, a_2, a_3, a_4, \ldots$, substitute (1.57) 
in (1.50). The first four values are 

\[ a_1 = \frac{1}{2}, \quad a_2 = 0, \quad a_3 = -\frac{1}{2}, \quad a_4 = 0. \]

From these values, we obtain the second convergent of the continued fraction for 
f(n) by applying the process described in (1.58): 

\[ \frac{1}{2n+\frac{2}{n}} = \frac{n}{2(n^2+1)}. \]

By also using $a_5 = \frac{5}{2}$ and $a_6 = 0$, we obtain the third convergent:²¹ 

\[ \frac{n^2+4}{2n(n^2+5)}. \]
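The coefficients $a_1, a_2, \ldots$ can be generated mechanically. The following is a sketch under stated assumptions, not Gauss's own computation: writing x = 1/n and expanding $1/(n\pm 1)^k = x^k(1\pm x)^{-k}$, the coefficient of $x^m$ in (1.50) gives $2a_m + 2\sum a_k\binom{m-1}{m-k} = [m = 1]$, where the sum runs over k < m with m − k even. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def series_coefficients(m_max):
    """Solve f(n+1) + f(n-1) = 1/n for f(n) = a_1/n + a_2/n^2 + ...

    Uses the recurrence a_m = [m == 1]/2 - sum of a_k * C(m-1, m-k)
    over k < m with m - k even (derived in the lead-in, an assumption
    of this sketch rather than a formula quoted from the text).
    """
    a = {}
    for m in range(1, m_max + 1):
        rhs = Fraction(1, 2) if m == 1 else Fraction(0)
        lower = sum(a[k] * comb(m - 1, m - k)
                    for k in range(1, m) if (m - k) % 2 == 0)
        a[m] = rhs - lower
    return a

coeffs = series_coefficients(6)
for m in sorted(coeffs):
    print(m, coeffs[m])
```

The same values result from expanding the third convergent $(n^2+4)/(2n(n^2+5))$ in powers of 1/n up to the fifth power.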


20 Gauss (1815). 
21 See Srinivasiengar (1967) pp. 149-151. 

One problem that arises in the computation of $a_1, a_2, a_3$, etc., is finding the series 
expansions of $\frac{1}{(n+1)^2}$, $\frac{1}{(n-1)^2}$, $\frac{1}{(n+1)^3}$, etc. Although this may appear to require 
knowledge of the binomial theorem for negative integer powers, observe that the series may 
be obtained by repeatedly multiplying the geometric series by itself. In our Chapter 4, 
we see that Newton verified the correctness of his binomial theorem by multiplying 
series. 


1.7. Exercises 


(1) Prove that if C is the circumference and D the diameter of a circle, then 

\[ C = 3D + 6D\left(\frac{1}{(2\cdot 2^2-1)^2-2^2} + \frac{1}{(2\cdot 4^2-1)^2-4^2} + \frac{1}{(2\cdot 6^2-1)^2-6^2} + \cdots\right). \]

This result, equivalent to (B.5), is easily derived from series (B.2); it is 
contained in the Karanapaddhati, by an unknown author from the Putumana 
illam in Sivapur, Kerala. The result is described: “Six times the diameter is 
divided separately by the square of twice the squares of even integers minus 
one, diminished by the squares of the even integers themselves. The sum of the 
resulting quotients increased by thrice the diameter is the circumference.” See 
Bag (1966) or Pai et al. (2018) pp. 150-151. Also see Srinivasiengar (1967) 
p. 149. 


(2) Compute 




1 mee ST : 3150) 


where $f_3$ is defined in (1.4). This gives π correct to eleven decimal places. 
In one of his astronomical works, Madhava gave a value of π: “For a circle 
of diameter 9 × 10¹¹ units, the circumference is 2,827,433,388,233 units.” 
This gives the approximate value of π as 3.14159265359, correct to eleven 
decimal places. The Sadratnamala by Sankara Varman of unknown date 
gives π to seventeen decimal places. See Parameswaran (1983) p. 194, and 
Srinivasiengar (1967). 

(3) Prove al-Haytham’s formula (1.20). 

(4) This exercise outlines the proof of Paramesvara’s formula for the radius of the 
circle circumscribing a cyclic quadrilateral, as given in the Kriyakramakari. 
First, prove that the product of the flank sides of any triangle divided by the 
diameter of its circumscribed circle is equal to the altitude of the triangle. This 
result follows from a rule given by Brahmagupta (c. 628) in an astronomical 
work, the Brahmasphutasiddhanta. 




Next, prove that the area of the cyclic quadrilateral is given by 

\[ A = \sqrt{(s-a)(s-b)(s-c)(s-d)} \quad\text{where}\quad s = \frac{a+b+c+d}{2}, \]

and a, b, c, d are the lengths of the sides of the quadrilateral. This was also 
stated by Brahmagupta. The Yuktibhasa contains a complete proof. See also 
Kichenassamy (2010), who convincingly argues that Brahmagupta had a proof 
and reconstructs it from indications in Brahmasphutasiddhanta. 


Then, let ABCD′ be the quadrilateral obtained from ABCD by interchanging 
the sides AD and CD, so that AD′ = CD = c and CD′ = AD = d. Show 
that if x, y, z denote the three diagonals AC, BD, BD′, respectively, then 

\[ yz = ab + cd, \quad zx = bc + da, \quad xy = ca + bd. \]


This is, of course, Ptolemy’s theorem. Ptolemy’s formula is equivalent 
to the addition formula for the sine function; his Almagest, containing this 
relation, is heavily indebted to the Chords in a Circle of Hipparchus. Bhaskara 
defined the three diagonals in his Lilavati. See Boyer and Merzbach (1991) 
and Maor (1998) pp. 87-94. Finally, prove that the radius of the circle 


circumscribing the cyclic quadrilateral is 

\[ R = \sqrt{\frac{(ad+bc)(ac+bd)(ab+cd)}{(b+c+d-a)(c+d+a-b)(d+a+b-c)(a+b+c-d)}}. \]

This is Paramesvara’s formula, sometimes attributed to S. A. J. L’Huillier, who 
published it in 1782. See Gupta (1977). 


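(5) Use al-Haytham’s formula (1.20) to obtain 

\[ \sum_{k=1}^{n} k^3 = \frac{1}{4}n^4 + \frac{1}{2}n^3 + \frac{1}{4}n^2 \]

and 

\[ \sum_{k=1}^{n} k^4 = \frac{1}{5}n^5 + \frac{1}{2}n^4 + \frac{1}{3}n^3 - \frac{1}{30}n. \]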


(6) Derive (1.6) by the methods described in this chapter. 
(7) Prove formulas (B.8) and (B.9), given in Section 1.1. 


1.8 Notes on the Literature 


It seems that the work of Madhava and his followers on series became known outside 
India only when a British civil servant and Indologist, Charles M. Whish, wrote 
a paper on the subject, posthumously published in the Transactions of the Royal 
Asiatic Society of Great Britain and Ireland in 1835. This journal was founded by 
British Indologists in the early 1830s, though Sir William Jones had first conceived 
the idea about fifty years earlier. Unfortunately, Whish’s paper had little impact. 
Interest in the Kerala school was renewed in the twentieth century by the efforts 
of C. Rajagopal and his associates, who published several papers on the topic. See 
Rajagopal (1949), Rajagopal and Aiyar (1951), Rajagopal and Venkataraman 
(1949), Rajagopal and Rangachari (1977), and Rajagopal and Rangachari (1986). 

The Yuktibhasa of Jyesthadeva and the Tantrasangraha of Nilakantha have recently 
been published with commentaries in English by Sarma: Nilakantha (1977) and 
Jyesthadeva (2008). Jyesthadeva (2008) was published after Sarma’s death, with 
additional notes by Ramasubramanian, Srinivasa, and Sriram. This two-volume 
translation with extensive and informative commentary contains both the mathemat- 
ical and astronomical portions; the original Malayalam text extends to 300 pages. 
Sarma (1972) also discusses the Kerala school, but from the astronomical point of 
view. Biographical information on the members of the Kerala school, as well as 
numerous other ancient and medieval Indian astronomers and mathematicians, can 
be found in David Pingree’s five-volume work (1970-1994). 

Readers who wish to read more on the Indian work on series, but with modern 
notation, may consult Roy (1990), Katz (1995), and Bressoud (2002). These papers 
are conveniently available in Anderson, Katz, and Wilson (2004). Also see the 
papers by Parameswaran (1983) on Madhava, Bag (1966) on the Karanapaddhati, 
Gupta (1977) on Paramesvara’s rule for radius of the cyclic quadrilateral, and Sarma 
and Hariharan (1991) on the Yuktibhasa. Plofker (2009) presents a scholarly, detailed, 
and readable discussion of Kerala mathematics, with several excerpts on π translated 
from Sankara Variyar’s Kriyakramakari. She also presents the derivation of the sine 
series with translations from the Yuktidipika and describes Takao Hayashi’s suggested 
reconstruction of Madhava’s remainder term results. In order to derive the continued 
fraction, Hayashi and his collaborators have compared the values of partial sums 
of Madhava’s series for π with the then-known rational approximations for π. 
Van Brummelen (2009), on the history of trigonometry, discusses the contributions 
of the Kerala school and relates them to the astronomical work of medieval India. 
In the context of the development of astronomy, Van Brummelen (2009) presents the 
Yuktibhasa derivation of the sine series. This accessible presentation is very helpful, 
since the mathematics of the Kerala school was largely motivated by an interest in 
astronomy. 


Sums of Powers of Integers 


2.1. Preliminary Remarks 


In his work on spirals, and on conoids and spheroids, Archimedes gave proofs of 
results equivalent to the formulas for the sum of the first n integers and for the sum of 
the squares of those integers; he represented these magnitudes by lines. In the proof 
of his proposition 11 of his study on spirals,¹ he considered an ascending arithmetic 
progression of magnitudes $A_1, A_2, \ldots, A_n$, whose common difference was the least 
term $A_1$. He placed the lines representing these magnitudes in order and parallel, and 
extended each line so that it would be equal to $A_n$. Thus, he had 

\[ A_1 + A_{n-1} = A_2 + A_{n-2} = \cdots = A_{n-1} + A_1 = A_n \]

and so 

\[ (A_1 + A_2 + \cdots + A_{n-1}) + A_n = \frac{n-1}{2}A_n + A_n = \frac{n+1}{2}A_n. \tag{2.1} \]

Taking $A_1$ to be a unit = 1, (2.1) then reduces to 

\[ 1 + 2 + \cdots + (n-1) + n = \frac{n(n+1)}{2}. \tag{2.2} \]

For the sum of squares of integers, Archimedes stated as the tenth proposition of 
his work on spirals, and also as a lemma to the second proposition of his work on 
conoids and spheroids:² 


If $A_1, A_2, \ldots, A_n$ be n lines forming an ascending arithmetic progression in which the common 
difference is equal to the least term $A_1$, then 

\[ (n+1)A_n^2 + A_1(A_1 + A_2 + A_3 + \cdots + A_n) = 3(A_1^2 + A_2^2 + \cdots + A_n^2). \tag{2.3} \]


1 Archimedes and Heath (1953) pp. 105 and 163. 
2 ibid. pp. 107-109. 

If we take $A_1 = 1$, then we can write (2.3) as 

\[ 3(1^2 + 2^2 + \cdots + n^2) = (n+1)n^2 + \frac{n(n+1)}{2}, \]

or 

\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \tag{2.4} \]


Archimedes applied these formulas to area and volume problems. As mentioned in 
Chapter 1, the mathematician and physicist al-Haytham (965-1040) extended 
Archimedes’ results to cubes and fourth powers by a method that could be extended to 
higher powers as well. In Chapter 1, we also discussed Jyesthadeva’s two methods 
for deriving the results for higher powers. One of these was based on al-Haytham’s 
procedure, given in general form in relation (1.20). Jyesthadeva’s purpose was to 
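arrive at the result that for a positive integer k 

\[ s_n^{(k)} \sim \frac{n^{k+1}}{k+1} \quad\text{as } n \to \infty, \tag{2.5} \]

where $s_n^{(k)} = 1^k + 2^k + \cdots + n^k$. Note that for the sequences of positive numbers $\{a_n\}$ 
and $\{b_n\}$, we write 

\[ a_n \sim b_n \tag{2.6} \]

if 

\[ \lim_{n\to\infty}\frac{a_n}{b_n} = 1. \tag{2.7} \]

Thus, (2.5) may be stated as 

\[ \lim_{n\to\infty}\frac{s_n^{(k)}}{n^{k+1}} = \frac{1}{k+1}, \tag{2.8} \]

equivalent to 

\[ \int_0^1 x^k\,dx = \frac{1}{k+1}. \tag{2.9} \]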


The work of Archimedes, al-Haytham, and Jyesthadeva showed that $s_n^{(k)}$ could 
be expressed as a polynomial in n of degree k + 1 for k = 1, 2, 3, .... However, these 
mathematicians were primarily interested in the highest power term of the polynomial, 
because in modern terms they wished to prove (2.8). 

By contrast, Johann Faulhaber (1580-1635) brought an algebraic and number 
theoretic approach to the study of the sums of powers of integers. He found a recursive 
method for expressing $s_n^{(k)}$ as a polynomial and determined some properties of the 
coefficients. Faulhaber was also motivated by a fascination with figurate numbers. In 
Chapter 1, we denoted the nth term of the kth sequence of figurate numbers by $V_n^{(k)}$ 
and noted formula (1.32). 

The one-dimensional figurate numbers are merely the consecutive positive integers 
1, 2, 3, .... The two-dimensional figurate numbers are the triangular numbers, where 
the nth triangular number is the sum of the first n consecutive numbers: 

\[ 1,\quad 1+2=3,\quad 1+2+3=6,\quad 1+2+3+4=10,\quad \ldots,\quad \frac{n(n+1)}{2},\quad \ldots. \]

The three-dimensional figurate numbers are the pyramidal or tetrahedral numbers 
such that the nth pyramidal number is the sum of the first n triangular numbers: 

\[ 1,\ 4,\ 10,\ 20,\ \ldots,\ \frac{n(n+1)(n+2)}{6},\ \ldots. \]

These formulas in modern notation can be written as 

\[ \sum_{k=1}^{n}\binom{k}{1} = \binom{n+1}{2}, \tag{2.10} \]

\[ \sum_{k=1}^{n}\binom{k+1}{2} = \binom{n+2}{3}. \tag{2.11} \]


When written this way, it is clear that the figurate numbers are related to the 
number of combinations of k things chosen from m different things, for appropriate 
m and k. It appears that the connection between figurate numbers and combinations 
was recognized by Narayana Pandita whose Ganita Kaumudi of c. 1356 makes this 
explicit in chapter 13, sutra 67, example 30. 

Narayana Pandita also algebraically extended the figurate numbers by taking sums 
of sums of sequences. So the sequence after the tetrahedral numbers would be 


\[ 1,\quad 1+4=5,\quad 1+4+10=15,\quad 1+4+10+20=35,\quad \ldots. \]


Some earlier mathematicians may have refrained from doing this because they did not 
conceive of dimensionality beyond three as meaningful. In effect, Narayana had the 
formula 


\[ \sum_{k=1}^{n}\binom{k+p-1}{p} = \binom{n+p}{p+1}. \tag{2.12} \]
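Formula (2.12) is easy to test for small values, much as Narayana is said to have verified it case by case. A minimal sketch with hypothetical function names, using Python's built-in binomial coefficient:

```python
from math import comb

def figurate_sum(n, p):
    """Left side of (2.12): sum of C(k+p-1, p) for k = 1, ..., n."""
    return sum(comb(k + p - 1, p) for k in range(1, n + 1))

def figurate_closed(n, p):
    """Right side of (2.12): C(n+p, p+1)."""
    return comb(n + p, p + 1)

# p = 3 sums the tetrahedral numbers 1, 4, 10, 20, ...
print([figurate_sum(n, 3) for n in range(1, 5)])
# Check the identity on a small grid of n and p.
print(all(figurate_sum(n, p) == figurate_closed(n, p)
          for n in range(1, 30) for p in range(1, 8)))
```

With p = 3 the partial sums are 1, 5, 15, 35, the sequence following the tetrahedral numbers that is listed above.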


Note that Narayana’s notation did not allow him to state formula (2.12) in general. 
He showed it true for small values of k and p and indicated that the process could 
be continued. From Narayana’s reference in his chapter 13, sutra 39, to Indian 
mathematicians of an earlier era, we surmise that they may also have been aware 
of formula (2.12). Again, Zhu Shijie, in his 1303 Siyuan Yujian, gave this formula for 
p = 1 to p = 5. From the manner in which Zhu Shijie discusses it, we may conclude 
that he was aware that the formula would hold for all positive integers p. 

The work of the English mathematicians Thomas Harriot and Henry Briggs on 
problems related to interpolation shows that they also understood formula (2.12). The 
German algebraist and arithmetician Johann Faulhaber also independently discovered 
(2.12), but his motivation was an interest in numbers and in particular the figurate 
numbers. However, his results do not seem to have influenced Fermat, Harriot, or 
Briggs.⁴ Note that Fermat used (2.12) to prove (2.8), just as Jyesthadeva had done.⁵ 

Faulhaber (1580-1635) was born in Ulm, Germany, and learned the weaving 
trade from his father. His love of computation led him to study mathematics. His 
knowledge of Latin was not very good, so in the course of his studies, he laboriously 
translated several mathematical texts, ancient and modern, into German. He founded 
a school for engineers in the early 1600s and wrote treatises on arithmetical questions. 
Faulhaber gave an algorithm for expressing $s_n^{(k)}$ as a polynomial in n; though he 
worked with Bernoulli numbers, he failed to note their significance. It was not the 
practice in Faulhaber’s time to give proofs of algorithms. Two centuries later, in a 
paper on the Euler—Maclaurin formula, Jacobi provided proofs of some of Faulhaber’s 
formulas. 

Around 1700, Jakob Bernoulli gave a simple method for computing the polynomial 
in n for $s_n^{(k)}$. Bernoulli numbers, a sequence of rational numbers, play a significant 
role in the determination of this polynomial. Bernoulli’s interest in the summation 
of finite and infinite series was connected with his study of probability theory. Jakob 
Bernoulli (1654-1705) was the eldest in an illustrious scientific and mathematical 
family, including his brother Johann, nephews Niklaus I, Niklaus II, Daniel, and 
Johann II. In 1676, Bernoulli received a degree in theology from the University of 
Basel, intending to go into the ministry. He then traveled in Europe, coming into 
contact with the Dutch mathematician Hudde and members of the Royal Society. 
These experiences aroused his interest in science and mathematics. In the 1680s, he 
taught himself mathematics by reading short treatments by Leibniz on differentiation 
and integration; he then taught this subject to his younger brother Johann. One of the 
first mathematicians to grasp Leibniz’s calculus, Jakob Bernoulli proceeded to apply 
it to fundamental problems in mechanics and to differential equations. The study of 
Huygens’s treatise on games of chance led Bernoulli to a study of probability theory, 
on which he wrote the first known full-length text, Ars Conjectandi. From 1687 until 
his death, Bernoulli happily served as professor of mathematics at Basel, in spite of a 
salary more meager than he would have received as a clergyman. This professorship 
was occupied by a member of the Bernoulli family for one hundred years. 

Although he spent many years on the problems contained in his probability treatise, 
Jakob Bernoulli never completed it. It appears that he wished to include several 
problems arising out of “civil, moral, and economic matters,” i.e., applications to 
practical situations. For example, even in the year of his death, he repeated his 


3 Hoe (2007) pp. 383-390. 
4 Edwards (2002) pp. 10-15. 
5 ibid. p. 88. 

earlier request to Leibniz for a hard-to-find copy of Jan de Witt’s work⁶ on annuities 
and life expectancy. Ars Conjectandi was posthumously published in 1713 by Jakob 
Bernoulli’s son Niklaus with a foreword by his nephew Niklaus I. Publication was 
delayed when Jakob’s immediate family, fearing academic dishonesty, refused to hand 
over the manuscript to Johann or to Niklaus I. 

Seki was another independent discoverer of the Bernoulli numbers. The Japanese 
mathematician Seki Takakazu was probably born in 1642 and we do not know who 
his mathematical mentor might have been. However, we know that he studied the 
thirteenth-century works of the Chinese mathematicians Yang Hui and Zhu Shijie. 
From Yang Hui’s Methods of Computation (Yang Hui Suanfa) of 1275, Seki read of 
the method for solving algebraic equations with numerical coefficients; this is now 
known as Horner’s method, or the Horner-Ruffini method. Seki made refinements to 
this method and also discovered Newton’s method. Seki made original contributions to 
the theory of determinants. His collected papers are accompanied by a helpful English 
summary of his mathematical accomplishments.⁷ 


2.2 Johann Faulhaber 


In 1631, Johann Faulhaber published Academia Algebrae in which he listed the 
formulas for 

\[ s_n^{(k)} = 1^k + 2^k + \cdots + n^k, \quad k = 1, 3, 5, \ldots, 17, \]

given as polynomials in $N = \frac{n(n+1)}{2}$ but without indication of a derivation or 
motivation. Donald Knuth found these formulas striking and useful, and he noted 
them in his 1993 paper on Faulhaber.⁸ Note that for odd k, if $s_n^{(k)}$ is a polynomial in 
N, then $s_n^{(k)}$ is a polynomial in n, but the converse does not hold. Following Knuth’s 
exposition, we note that Faulhaber wrote his formulas in the form 

\[ \sum_{i=1}^{n} i^{2k+1} = b_0N^{k+1} - b_1N^{k} + b_2N^{k-1} - \cdots + (-1)^{k-1}b_{k-1}N^2, \quad k \ge 1, \tag{2.13} \]

where $b_0, b_1, b_2, \ldots$ represented positive rational numbers; when k = 0, then 
$\sum_{i=1}^{n} i = N$. Faulhaber also observed that 

\[ b_0 = \frac{2^k}{k+1} \quad\text{and}\quad b_{k-2} = 4b_{k-1}. \tag{2.14} \]


6 Bernoulli and Sylla (2006) p. 46. 
7 Seki and Hirayama et al. (1974). 
8 Knuth (1993). 

Faulhaber also derived formulas for some even powers of consecutive integers. He 
noticed that (2.13) would hold if and only if 

\[ \sum_{i=1}^{n} i^{2k} = \frac{n+\frac{1}{2}}{2k+1}\Bigl((k+1)b_0N^{k} - kb_1N^{k-1} + (k-1)b_2N^{k-2} - \cdots + (-1)^{k-1}2b_{k-1}N\Bigr). \tag{2.15} \]
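Relations (2.13) through (2.15) can be tried out on the first few cases. The sketch below is an illustration, not a reconstruction of Faulhaber's method: the coefficient table B holds the standard closed forms $\sum i^3 = N^2$, $\sum i^5 = (4N^3 - N^2)/3$, and $\sum i^7 = (6N^4 - 4N^3 + N^2)/3$, which are supplied here as an assumption and are not quoted from the text; the function names are hypothetical.

```python
from fractions import Fraction

# Assumed coefficients b_0, ..., b_{k-1} of (2.13) for k = 1, 2, 3.
B = {
    1: [Fraction(1)],
    2: [Fraction(4, 3), Fraction(1, 3)],
    3: [Fraction(2), Fraction(4, 3), Fraction(1, 3)],
}

def odd_power_sum(n, k):
    """Right side of (2.13): a polynomial in N = n(n+1)/2 for sum of i^(2k+1)."""
    N = Fraction(n * (n + 1), 2)
    return sum((-1) ** i * b * N ** (k + 1 - i) for i, b in enumerate(B[k]))

def even_power_sum(n, k):
    """Right side of (2.15): the companion formula for sum of i^(2k)."""
    N = Fraction(n * (n + 1), 2)
    inner = sum((-1) ** i * (k + 1 - i) * b * N ** (k - i)
                for i, b in enumerate(B[k]))
    return Fraction(2 * n + 1, 2 * (2 * k + 1)) * inner

for k in (1, 2, 3):
    n = 6
    print(k, odd_power_sum(n, k), sum(i ** (2 * k + 1) for i in range(1, n + 1)),
          even_power_sum(n, k), sum(i ** (2 * k) for i in range(1, n + 1)))
```

The factor (2n + 1)/(2(2k + 1)) in the code is the same quantity as $(n + \frac{1}{2})/(2k+1)$ in (2.15).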


In 1834, Jacobi proved formulas (2.13) through (2.15), using a method that clearly 
would have been unfamiliar to Faulhaber.⁹ Knuth has given an interesting reconstruction 
of the methods Faulhaber might well have employed. A. W. F. Edwards¹⁰ gave a 
technique for finding the sum of odd powers as a polynomial in N by use of the sums 
of lower order odd powers. In brief, first use the binomial theorem to expand 

\[ x^k(x+1)^k - (x-1)^kx^k = 2\left(\binom{k}{1}x^{2k-1} + \binom{k}{3}x^{2k-3} + \binom{k}{5}x^{2k-5} + \cdots\right). \]

Next, successively set x = n, n − 1, n − 2, ..., 1 and add the corresponding formulas 
to obtain 

\[ n^k(n+1)^k = (2N)^k = 2\left(\binom{k}{1}s_n^{(2k-1)} + \binom{k}{3}s_n^{(2k-3)} + \cdots\right). \tag{2.16} \]


Thus, from (2.16) one may ascertain the sum of the 2k — 1 powers of integers when 
the sums of the odd powers of integers up to the exponent 2k — 3 are known. 
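The recursion implied by (2.16) can be sketched in a few lines. This is an illustration for a fixed numerical n, with a hypothetical function name; it assumes (2.16) in the form $(2N)^k = 2\sum_j \binom{k}{j}s_n^{(2k-j)}$ with j running over the odd integers from 1 to k.

```python
from math import comb

def odd_power_sums(n, k_max):
    """Return {2k-1: 1^(2k-1) + ... + n^(2k-1)} for k = 1, ..., k_max.

    Each new sum is obtained from (2.16) using only the odd-power
    sums of lower exponent that were computed in earlier steps.
    """
    two_N = n * (n + 1)
    sums = {}
    for k in range(1, k_max + 1):
        # Terms with j = 3, 5, ... involve exponents 2k-3, 2k-5, ...
        lower = sum(comb(k, j) * sums[2 * k - j] for j in range(3, k + 1, 2))
        # Solve (2N)^k = 2 * (k * s^(2k-1) + lower) for s^(2k-1).
        sums[2 * k - 1] = (two_N ** k - 2 * lower) // (2 * k)
    return sums

result = odd_power_sums(10, 5)
for exponent in sorted(result):
    print(exponent, result[exponent])
```

For k = 2 the step reduces to $(2N)^2 = 4s_n^{(3)}$, the familiar fact that the sum of cubes equals $N^2$.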


2.3 Fermat 


Fermat rediscovered Narayana’s formula (2.12) in approximately 1635, although he 
apparently never wrote down a proof. Rather, in the margin of his copy of Diophantus’s 
On Polygonal Numbers, he wrote that he had discovered this proposition, calling it 
“beautiful.”¹¹ In a 1636 letter to Roberval, Fermat stated the result:¹² 


The last number multiplied by the next larger number is double the collateral triangle; the 
last number multiplied by the triangle of the next larger is three times the collateral pyramid; 
the last number multiplied by the pyramid of the next larger is four times the collateral triangulo- 
triangle; and so on to infinity by this uniform method. 


The first line of Fermat’s description becomes $n(n+1) = 2\sum_{k=1}^{n} k$; the second 
line becomes $n\cdot\frac{(n+1)(n+2)}{2} = 3\sum_{k=1}^{n}\frac{k(k+1)}{2}$; and so on. Fermat wrote that these results 
had helped him determine the sums of powers of integers and that these in turn gave 
him the areas under the curves $y = x^k$, k = 1, 2, 3, .... The example of the curve 
$y = x^4$, as given to Roberval,¹³ does not indicate a general method. He wrote, 


9 Jacobi (1834). 
10 Edwards (1986). 
11 Boyer (1943). 
12 ibid. p. 238. 
13 See Mahoney (1973) p. 230. 

If you multiply four times the greatest number increased by two by the square of the triangle of 
numbers, and from the product you subtract the sum of the squares of the individual numbers, 
five times the sum of the fourth powers will result. 


His description, in our symbolic form, can be given as 

\[ 5\sum_{i=1}^{n} i^4 = (4n+2)\left(\frac{n(n+1)}{2}\right)^2 - \sum_{i=1}^{n} i^2. \]


In general, Fermat possibly had the following procedure in mind: 
Write (2.12) in the form 

\[ \sum_{k=1}^{n} k(k+1)\cdots(k+p-1) = \frac{1}{p+1}\,n(n+1)\cdots(n+p). \tag{2.17} \]

Note that the product in the summation may be written as a polynomial in k: 

\[ k^p + A_1k^{p-1} + A_2k^{p-2} + \cdots + A_{p-1}k, \tag{2.18} \]

and the right-hand side of (2.17) may be written as a polynomial in n: 

\[ \frac{1}{p+1}\left(n^{p+1} + B_1n^{p} + B_2n^{p-1} + \cdots + B_pn\right). \tag{2.19} \]

As discussed in Section 10.7, the coefficients $A_1, \ldots, A_{p-1}$ and $B_1, \ldots, B_p$ are 
Stirling numbers, but this fact is not required to solve the problem of finding the sums 
of powers of integers. Observe that by using (2.18) and (2.19) in (2.17) we obtain 

\[ s_n^{(p)} + A_1s_n^{(p-1)} + A_2s_n^{(p-2)} + \cdots + A_{p-1}s_n^{(1)} = \frac{1}{p+1}\left(n^{p+1} + B_1n^{p} + \cdots + B_pn\right). \tag{2.20} \]


Note that $s_n^{(1)} = \frac{1}{2}n^2 + \frac{1}{2}n$ is a polynomial in n of degree 2 and the coefficient of 
the highest power is $\frac{1}{2}$. We inductively assume that for k = 2, 3, ..., p − 1, $s_n^{(k)}$ is a 
polynomial in n of degree k + 1 and the coefficient of the highest power is $\frac{1}{k+1}$, so that 
(2.20) would imply that $s_n^{(p)}$ is a polynomial of degree p + 1, with the coefficient of 
the highest power of n being $\frac{1}{p+1}$. 
This result would be sufficient to show that 

\[ \lim_{n\to\infty}\frac{s_n^{(p)}}{n^{p+1}} = \frac{1}{p+1} \]

and verifies Fermat’s remark on the determination of the area under curves of the 
form $y = x^p$. Fermat worked out the case p = 4 in his letters to Roberval and 
Mersenne. He clearly thought that the cases p = 1, 2, 3 had been found in antiquity, 
but that he had presented a new case. He was apparently unaware¹⁴ of the work 
of al-Haytham (965-1040), who had explicitly determined $s_n^{(4)}$, or of the work of 
the Kerala mathematicians or, indeed, of the work of Faulhaber who had found $s_n^{(p)}$ 
through p = 17. 
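The procedure attributed above to Fermat can be sketched as follows. This is a hedged illustration for a fixed numerical n, not a claim about how Fermat computed: the coefficients $A_i$ of (2.18) are found by multiplying out k(k + 1)⋯(k + p − 1), and (2.20) is then solved for $s_n^{(p)}$. Both function names are hypothetical.

```python
from math import prod

def rising_product_coeffs(p):
    """Coefficients of k(k+1)...(k+p-1), highest power of k first."""
    coeffs = [1]
    for c in range(p):
        shifted = coeffs + [0]          # multiply the polynomial by k
        for i, a in enumerate(coeffs):  # then add c times the polynomial
            shifted[i + 1] += c * a
        coeffs = shifted
    return coeffs

def fermat_power_sums(n, p_max):
    """Return {p: 1^p + ... + n^p} for p = 1, ..., p_max using (2.17) and (2.20)."""
    s = {}
    for p in range(1, p_max + 1):
        coeffs = rising_product_coeffs(p)  # [1, A_1, ..., A_{p-1}, 0]
        # Right side of (2.17): n(n+1)...(n+p) / (p+1), an exact integer.
        rhs = prod(range(n, n + p + 1)) // (p + 1)
        lower = sum(coeffs[i] * s[p - i] for i in range(1, p))
        s[p] = rhs - lower
    return s

print(rising_product_coeffs(4))
print(fermat_power_sums(7, 4))
```

For p = 4 the product expands to $k^4 + 6k^3 + 11k^2 + 6k$, so the fourth-power sum is the right side of (2.17) minus $6s_n^{(3)} + 11s_n^{(2)} + 6s_n^{(1)}$.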


2.4 Pascal 


Pascal also gave a proof that $s_n^{(k)} = \frac{n^{k+1}}{k+1} +$ terms of lower power, but his proof 
differed from that of Fermat. Pascal treated this problem in the second part of his 
monograph on the arithmetical triangle, published posthumously,¹⁵ where he showed 
that 

\[ (n+1)^{k+1} - 1 = \binom{k+1}{1}s_n^{(k)} + \binom{k+1}{2}s_n^{(k-1)} + \cdots + \binom{k+1}{k+1}s_n^{(0)}. \tag{2.21} \]

In fact, Pascal proved a more general formula covering the sums of powers of any 
arithmetic progression,¹⁶ rather than the particular case 1, 2, ..., n. The case (2.21) 
will serve quite nicely, however, to illustrate Pascal’s approach. To verify (2.21) by 
Pascal’s method, start with the binomial theorem for positive integer exponents 

\[ (x+1)^{k+1} - x^{k+1} = \sum_{j=1}^{k+1}\binom{k+1}{j}x^{k+1-j}. \tag{2.22} \]

In (2.22), successively set x = n, n − 1, ..., 2, 1 to arrive at 

\begin{align*}
(n+1)^{k+1} - n^{k+1} &= \sum_{j=1}^{k+1}\binom{k+1}{j}n^{k+1-j},\\
n^{k+1} - (n-1)^{k+1} &= \sum_{j=1}^{k+1}\binom{k+1}{j}(n-1)^{k+1-j},\\
(n-1)^{k+1} - (n-2)^{k+1} &= \sum_{j=1}^{k+1}\binom{k+1}{j}(n-2)^{k+1-j},\\
&\;\;\vdots\\
3^{k+1} - 2^{k+1} &= \sum_{j=1}^{k+1}\binom{k+1}{j}2^{k+1-j},\\
2^{k+1} - 1^{k+1} &= \sum_{j=1}^{k+1}\binom{k+1}{j}1^{k+1-j}.
\end{align*}


14 ibid. p. 231, footnote 35. 
15 Pascal (1665). 
16 Boyer (1943) provides an English translation of this formula on p. 239. 
Add the $n$ equations. The left-hand side reduces to $(n+1)^{k+1} - 1$ after cancellation,
and the right-hand side sums to

$$\sum_{j=1}^{k+1} \binom{k+1}{j} S_n^{(k+1-j)},$$

implying that if $S_n^{(j)}$, $j = 0, 1, \ldots, k-1$, are known, then $S_n^{(k)}$ can be determined.
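
The recursive use of (2.21) described above can be tried out numerically. The following is only a sketch under stated assumptions: the function name `power_sum_polys` and the coefficient-list representation are illustrative choices, not Pascal’s notation, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from math import comb

def power_sum_polys(k_max):
    """Return coefficient lists for S_n^(0), ..., S_n^(k_max).

    polys[k][i] is the coefficient of n**i in S_n^(k) = 1^k + 2^k + ... + n^k.
    """
    polys = []
    for k in range(k_max + 1):
        # Left-hand side of (2.21): (n + 1)^(k+1) - 1 as a polynomial in n.
        lhs = [Fraction(comb(k + 1, i)) for i in range(k + 2)]
        lhs[0] -= 1
        # Subtract the already known terms C(k+1, j) * S_n^(k+1-j), j = 2..k+1.
        for j in range(2, k + 2):
            for i, c in enumerate(polys[k + 1 - j]):
                lhs[i] -= comb(k + 1, j) * c
        # What remains equals C(k+1, 1) * S_n^(k); divide by k + 1.
        polys.append([c / (k + 1) for c in lhs])
    return polys

polys = power_sum_polys(4)
print(polys[2])  # coefficients of n^0, n^1, n^2, n^3 in S_n^(2)
```

Each new polynomial needs all the earlier ones, which is exactly the dependence on $S_n^{(0)}, \ldots, S_n^{(k-1)}$ noted in the text.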


2.5 Seki and Jakob Bernoulli on Bernoulli Numbers 


Seki Takakazu (1643-1708) and Jakob Bernoulli (1651-1705) discovered the 
sequence of rational numbers now designated Bernoulli numbers at around the 
same time. Seki’s contribution to this topic was published posthumously in 1712
in volume one of his Katsuyō Sampō;¹⁷ these books have not been translated into
English, but the editors have given useful comments on and some summaries of the
contents. In the following year, Bernoulli’s work was also published posthumously
in his book on probability, Ars Conjectandi.¹⁸ Especially since Seki’s book appeared
earlier, it has been suggested that Bernoulli numbers be renamed “Seki-Bernoulli
numbers.”¹⁹

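Seki began by expressing in Japanese sentences the formulas:

$$\begin{aligned}
S_n^{(1)} &= 1 + 2 + \cdots + n = \frac{1}{2}(n + n^2), \\
S_n^{(2)} &= 1^2 + 2^2 + \cdots + n^2 = \frac{1}{6}(n + 3n^2 + 2n^3), \\
S_n^{(3)} &= 1^3 + 2^3 + \cdots + n^3 = \frac{1}{4}(n^2 + 2n^3 + n^4), \\
S_n^{(4)} &= 1^4 + 2^4 + \cdots + n^4 = \frac{1}{30}(-n + 10n^3 + 15n^4 + 6n^5),
\end{aligned}$$

up to $S_n^{(11)}$, without explaining how he arrived at these formulas.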
The editors of Seki’s collected papers think it probable that he used a method 
explained earlier in his book. According to this method, Seki would have first assumed, 


for example, that $S_n^{(2)}$ was a polynomial of degree 3:²⁰

$$1^2 + 2^2 + \cdots + n^2 = an + bn^2 + cn^3.$$


Note that because the sum of the terms with n = 0 would be 0, there is no constant 
term. Seki would then have taken n = 1,2,3 to obtain the three linear equations 


17 Seki and Hirayama et al. (1974).

18 Bernoulli and Sylla (2006).

19 Arakawa et al. (2014), especially p. 3. 

20 Seki and Hirayama et al. (1974) pp. 40-43. 
$$\begin{aligned}
a + b + c &= 1 \\
2a + 4b + 8c &= 5 \\
3a + 9b + 27c &= 14
\end{aligned}$$

and would have solved them to find $a = \frac{1}{6}$, $b = \frac{1}{2}$, $c = \frac{1}{3}$. Note that Seki’s approach
here reflects his extensive work in linear equations and determinants.
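
The undetermined-coefficients method attributed to Seki above can be sketched as follows. This is an illustration only: the helper `solve_linear` is a hypothetical name, and plain Gauss-Jordan elimination over the rationals stands in for whatever elimination procedure Seki actually used.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination over Q."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]

degree = 3
# Row for n: coefficients of a, b, c in a*n + b*n^2 + c*n^3.
rows = [[n ** j for j in range(1, degree + 1)] for n in range(1, degree + 1)]
# Right-hand sides: 1^2 + ... + n^2 for n = 1, 2, 3, that is 1, 5, 14.
rhs = [sum(i ** 2 for i in range(1, n + 1)) for n in range(1, degree + 1)]
a, b, c = solve_linear(rows, rhs)
print(a, b, c)
```

Changing the exponent 2 and `degree` gives the analogous systems for higher powers.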

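After giving the statements describing the formulas for $S_n^{(1)}, S_n^{(2)}, \ldots, S_n^{(11)}$, Seki
presented the method for deriving the general formula. He expanded

$$(1+x)^n - 1 \quad \text{for } n = 1, 2, 3, \ldots$$

to get

$$1 + x - 1 = x, \quad (1+x)^2 - 1 = 2x + x^2, \quad (1+x)^3 - 1 = 3x + 3x^2 + x^3, \ \ldots;$$

he then introduced numbers, denoted here by $K_0, K_1, K_2, \ldots$, as solutions of the equations

$$\begin{aligned}
1 &= K_0 \\
2 &= 1 + 2K_1 \\
3 &= 1 + 3K_1 + 3K_2 \\
4 &= 1 + 4K_1 + 6K_2 + 4K_3 \\
&\;\;\vdots
\end{aligned} \tag{2.23}$$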


Observe that the integers on the right-hand side of the equations (2.23) are binomial 
coefficients. For example, the integers 1,3,3 in the third row are the coefficients in the 
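expansion of $(1+x)^3 - 1$. Similarly, the integers in the fourth row are 1,4,6,4, the
coefficients in the expansion of $(1+x)^4 - 1$. Seki calculated some values of $K_j$ and
then presented the following values in order:

$$1, \ \frac{1}{2}, \ \frac{1}{6}, \ 0, \ -\frac{1}{30}, \ \ldots. \tag{2.24}$$

Thus, the fifth number in the list, $-\frac{1}{30}$, is $K_4$, and so on. Seki made a table that
shows that

$$\begin{aligned}
S_n^{(1)} &= \frac{1}{2}(n^2 + 2K_1 n), \\
S_n^{(2)} &= \frac{1}{3}(n^3 + 3K_1 n^2 + 3K_2 n), \\
S_n^{(3)} &= \frac{1}{4}(n^4 + 4K_1 n^3 + 6K_2 n^2 + 4K_3 n) \\
&= \frac{1}{4}(n^4 + 2n^3 + n^2 + 0n).
\end{aligned}$$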


Seki gave no proof of these results and also apparently gave no clue as to how 
he came upon them. Observe that the sequence (2.24) is the sequence of Bernoulli 
numbers, with the difference that we now take the second number to be $-\frac{1}{2}$. When the
Bernoulli numbers, defined in the following section, are denoted by $B_k$, we see that
$B_k = K_k$, except that $B_1 = K_1 - 1$. Note that for Jakob Bernoulli, too, $K_1 = \frac{1}{2}$.
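
The triangular system (2.23) determines each $K_j$ from the earlier ones. As a rough sketch (the name `seki_numbers` is an illustrative assumption, and the row pattern $n = \sum_{j=0}^{n-1} \binom{n}{j} K_j$ is read off from the four rows displayed in (2.23)):

```python
from fractions import Fraction
from math import comb

def seki_numbers(count):
    """Return K_0, ..., K_{count-1} by solving the rows of (2.23) in order."""
    ks = []
    for n in range(1, count + 1):
        # Row n reads n = sum_{j=0}^{n-1} C(n, j) K_j; K_{n-1} is the new unknown.
        known = sum(comb(n, j) * ks[j] for j in range(n - 1))
        ks.append((Fraction(n) - known) / comb(n, n - 1))
    return ks

ks = seki_numbers(8)
print(ks)
# Modern convention: same numbers, except that B_1 = K_1 - 1.
bs = [k - 1 if i == 1 else k for i, k in enumerate(ks)]
```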


2.6 Jakob Bernoulli’s Polynomials 


The second part of Bernoulli’s great probability treatise contains results on permuta- 
tions and combinations. He rigorously worked out the connection between binomial 
coefficients and figurate numbers. He thought that he was the first to do this, but Pascal 
anticipated him in 1654, as did others in earlier centuries. Bernoulli also rediscovered 
the formula (2.12) and applied it to the problem of finding the sums of powers of 
integers. Herein he made his enduring discovery of the role played by the sequence of 
rational numbers now named after him. Bernoulli found a pattern in the coefficients 
of the polynomials for $S_n^{(p)}$ that had been missed by so outstanding an arithmetician
as Faulhaber.

Bernoulli began by explicitly expressing $S_n^{(p)}$ for $p = 1, 2, \ldots, 10$ as polynomials
in $n$:²¹


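Sums of Powers

$$\begin{aligned}
\int n &= \tfrac{1}{2} nn + \tfrac{1}{2} n \\
\int nn &= \tfrac{1}{3} n^3 + \tfrac{1}{2} nn + \tfrac{1}{6} n \\
\int n^3 &= \tfrac{1}{4} n^4 + \tfrac{1}{2} n^3 + \tfrac{1}{4} nn \\
\int n^4 &= \tfrac{1}{5} n^5 + \tfrac{1}{2} n^4 + \tfrac{1}{3} n^3 \ast - \tfrac{1}{30} n \\
\int n^5 &= \tfrac{1}{6} n^6 + \tfrac{1}{2} n^5 + \tfrac{5}{12} n^4 \ast - \tfrac{1}{12} nn \\
\int n^6 &= \tfrac{1}{7} n^7 + \tfrac{1}{2} n^6 + \tfrac{1}{2} n^5 \ast - \tfrac{1}{6} n^3 \ast + \tfrac{1}{42} n \\
\int n^7 &= \tfrac{1}{8} n^8 + \tfrac{1}{2} n^7 + \tfrac{7}{12} n^6 \ast - \tfrac{7}{24} n^4 \ast + \tfrac{1}{12} nn \\
\int n^8 &= \tfrac{1}{9} n^9 + \tfrac{1}{2} n^8 + \tfrac{2}{3} n^7 \ast - \tfrac{7}{15} n^5 \ast + \tfrac{2}{9} n^3 \ast - \tfrac{1}{30} n \\
\int n^9 &= \tfrac{1}{10} n^{10} + \tfrac{1}{2} n^9 + \tfrac{3}{4} n^8 \ast - \tfrac{7}{10} n^6 \ast + \tfrac{1}{2} n^4 \ast - \tfrac{1}{12} nn \\
\int n^{10} &= \tfrac{1}{11} n^{11} + \tfrac{1}{2} n^{10} + \tfrac{5}{6} n^9 \ast - 1 n^7 \ast + 1 n^5 \ast - \tfrac{1}{2} n^3 \ast + \tfrac{5}{66} n.
\end{aligned}$$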


21 Hellegouarch (2002) p. 52 for Schneps’s translation; Bernoulli and Sylla (2006) pp. 215-216 also has a
translation.
A. W. F. Edwards has noted that the last term in the polynomial for $\int n^9$ should be
$-\frac{3}{20} n^2$, rather than $-\frac{1}{12} n^2$.²²

Bernoulli went on:²³


Any one who carefully observed the symmetry properties of this table will easily be able to 
continue it. If we let c denote an arbitrary exponent, we have 


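$$\begin{aligned}
\int n^c = {} & \frac{1}{c+1} n^{c+1} + \frac{1}{2} n^c + \frac{c}{2} A n^{c-1} + \frac{c(c-1)(c-2)}{2\cdot 3\cdot 4} B n^{c-3} \\
& + \frac{c(c-1)(c-2)(c-3)(c-4)}{2\cdot 3\cdot 4\cdot 5\cdot 6} C n^{c-5} \\
& + \frac{c(c-1)(c-2)(c-3)(c-4)(c-5)(c-6)}{2\cdot 3\cdot 4\cdot 5\cdot 6\cdot 7\cdot 8} D n^{c-7} + \cdots \text{ and so on,}
\end{aligned} \tag{2.25}$$

the exponents of n decreasing by 2 until n or nn is reached. The capitals A, B, C, D, etc. denote,
in order, the last terms in the expressions of $\int nn$, $\int n^4$, $\int n^6$, $\int n^8$ etc. namely

$$A = \frac{1}{6}, \quad B = -\frac{1}{30}, \quad C = \frac{1}{42}, \quad D = -\frac{1}{30}.$$

But these coefficients are so established that each of the coefficients along with the others of its
order adds up to one. Thus $D = -\frac{1}{30}$ since

$$\frac{1}{9} + \frac{1}{2} + \frac{2}{3} - \frac{7}{15} + \frac{2}{9} + D = 1.$$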

Using this table, it took me less than a quarter of an hour to compute the tenth powers of the first 
1000 integers; the result is 


91,409,924,241,424,243,424,241,924,242,500.


This example shows the uselessness of the book Arithmetica Infinitorum by Ismael Bullialdus, 
which is entirely devoted to a tremendously large computation of the sums of the six first powers — 
less than what I have accomplished in a single page. 
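
Bernoulli’s quarter-hour computation is easy to replay. The sketch below compares a brute-force sum with the closed form in the last row of his table; the function name `tenth_power_sum` is an illustrative assumption, and exact rationals are used so that no rounding intervenes.

```python
from fractions import Fraction

def tenth_power_sum(n):
    """Closed form for 1^10 + 2^10 + ... + n^10 from the last table row."""
    n = Fraction(n)
    return (n**11 / 11 + n**10 / 2 + Fraction(5, 6) * n**9
            - n**7 + n**5 - n**3 / 2 + Fraction(5, 66) * n)

direct = sum(i**10 for i in range(1, 1001))  # brute-force addition
closed = tenth_power_sum(1000)               # Bernoulli's polynomial at n = 1000
print(direct)
print(direct == closed)
```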


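If we denote $A, B, C, D, \ldots$ by $B_2, B_4, B_6, B_8, \ldots$ respectively, then equation (2.25)
can be written as

$$1^c + 2^c + \cdots + n^c = \frac{1}{c+1} n^{c+1} + \frac{1}{2} n^c + \frac{c}{2} B_2 n^{c-1} + \frac{c(c-1)(c-2)}{2\cdot 3\cdot 4} B_4 n^{c-3} + \cdots, \tag{2.26}$$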


where $c$ is a positive integer. Observe that Bernoulli was able to find a recurrence
relation for the Bernoulli numbers by setting $n = 1$ in (2.26), obtaining

$$\frac{1}{c+1} + \frac{1}{2} + \binom{c}{1}\frac{B_2}{2} + \binom{c}{3}\frac{B_4}{4} + \binom{c}{5}\frac{B_6}{6} + \cdots = 1; \tag{2.27}$$

this equation can be used to determine $B_2, B_4, \ldots$. Thus, take $c = 2$ to find $B_2 = \frac{1}{6}$;
take $c = 4$ to obtain


22 Edwards (2002) p. 128. 
23 Hellegouarch (2002) pp. 52-53. Bernoulli and Sylla (2006) also has a nice translation. 
$$\frac{1}{5} + \frac{1}{2} + \frac{1}{3} + 4\,\frac{B_4}{4} = 1 \quad \text{or} \quad B_4 = -\frac{1}{30}.$$
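
The recurrence (2.27) can be run mechanically for $c = 2, 4, 6, \ldots$. The following sketch assumes the form of (2.27) obtained by setting $n = 1$ in (2.26); the function name `bernoulli_even` is an illustrative choice.

```python
from fractions import Fraction
from math import comb

def bernoulli_even(count):
    """Return [B_2, B_4, ..., B_{2*count}] from (2.27) with c = 2, 4, ..."""
    values = []
    for m in range(1, count + 1):
        c = 2 * m
        # Known part of (2.27): 1/(c+1) + 1/2 + sum_{j<m} C(c, 2j-1) B_{2j} / (2j).
        known = Fraction(1, c + 1) + Fraction(1, 2)
        known += sum(comb(c, 2 * j - 1) * values[j - 1] / (2 * j)
                     for j in range(1, m))
        # The coefficient of B_c itself is C(c, c-1) / c = 1.
        values.append(1 - known)
    return values

print(bernoulli_even(5))
```

Note that every earlier even-indexed number enters each step, which is the computational drawback of a recurrence with gaps of 2.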


Observe that the coefficients of $n^{c-2}, n^{c-4}, \ldots$ in Bernoulli’s (2.26) are all apparently
zero, though this was not explained by Bernoulli. Later in this chapter, we shall
give Euler’s argument for this point.

Today, we would sum up to $(n-1)^c$ on the left-hand side of (2.26); we would
therefore subtract $n^c$ from each side of the equation and also insert the coefficients of
the missing powers of $n$:

$$\begin{aligned}
1^c + 2^c + \cdots + (n-1)^c = \frac{1}{c+1}\Big( & n^{c+1} + \binom{c+1}{1} B_1 n^c + \binom{c+1}{2} B_2 n^{c-1} \\
& + \binom{c+1}{3} B_3 n^{c-2} + \binom{c+1}{4} B_4 n^{c-3} + \cdots + \binom{c+1}{c} B_c n \Big),
\end{aligned} \tag{2.28}$$

where $B_1 = -\frac{1}{2}$ and $B_3, B_5, \ldots$ are all zero. Now if $B_{c+1}$ is added to the polynomial
in $n$ in parentheses on the right-hand side of (2.28), the resulting polynomial would
be the Bernoulli polynomial of degree $c+1$, denoted as $B_{c+1}(n)$; we could then write
(2.28) as

$$\sum_{i=1}^{n-1} i^c = \frac{1}{c+1}\left(B_{c+1}(n) - B_{c+1}\right). \tag{2.29}$$


Bernoulli left it to the reader to use “the symmetry properties of this table” to
figure out how he obtained his general formula for the sums of powers. A little earlier
in his book, Bernoulli presented a “table of combinations or figurate numbers” and
analyzed it columnwise.²⁴ Apply this idea to Bernoulli’s table on sums of powers. The
progression in the first column is easy to understand. Now in the second column, factor
out $\frac{1}{2}$ to obtain the progression $1, 1, 1, \ldots$; these form the first column of Bernoulli’s
table of figurate numbers. Next factor out the first number in the third column, $\frac{1}{6}$,
to obtain the sequence $1, \frac{3}{2}, 2, \frac{5}{2}, 3, \ldots$, and this turns out to be $\frac{1}{2}$ of the sequence
$2, 3, 4, 5, 6, \ldots$ appearing in the second column of the figurate numbers table. The
fourth column of the sums of powers table consists of only zeros but factor out $-\frac{1}{30}$
from the fifth column to obtain the progression $1, \frac{5}{2}, 5, \frac{35}{4}, 14, \ldots$. This last sequence
is equal to $\frac{1}{4}$ of the fourth column in the figurate numbers table. These observations
clarify Bernoulli’s comment on the “symmetry properties of the table.” Note that the
Bernoulli numbers are formed by the sequence of coefficients of $n$ in the polynomial
expansions of the various sums of powers. Today, however, we take the first Bernoulli
number to be $-\frac{1}{2}$ rather than $\frac{1}{2}$ so that the signs alternate.

It is possible, in fact, to derive the Bernoulli-Seki formula for $S_n^{(m)}$ from Pascal’s
identity (2.21), as Boyer has shown.²⁵ To see this, write the polynomial for $S_n^{(m)}$ as


24 See Bernoulli and Sylla (2006) p. 206. 
25 Boyer (1943) pp. 242-243. 
$$S_n^{(m)} = \frac{n^{m+1}}{m+1} + A_1(m) n^m + A_2(m) n^{m-1} + A_3(m) n^{m-2} + A_4(m) n^{m-3} + \cdots, \tag{2.30}$$


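where $A_1(m), A_2(m), A_3(m), \ldots$ are functions of $m$. Thus, (2.21) takes the form

$$\begin{aligned}
n^{m+1} + \binom{m+1}{1} n^m & + \binom{m+1}{2} n^{m-1} + \binom{m+1}{3} n^{m-2} + \cdots + \binom{m+1}{m} n \\
= {} & \binom{m+1}{1}\left(\frac{n^{m+1}}{m+1} + A_1(m) n^m + A_2(m) n^{m-1} + \cdots\right) \\
& + \binom{m+1}{2}\left(\frac{n^m}{m} + A_1(m-1) n^{m-1} + A_2(m-1) n^{m-2} + \cdots\right) \\
& + \binom{m+1}{3}\left(\frac{n^{m-1}}{m-1} + A_1(m-2) n^{m-2} + A_2(m-2) n^{m-3} + \cdots\right) + \cdots.
\end{aligned}$$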

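Equate the coefficients of $n^m$ on each side to obtain

$$\binom{m+1}{1} = \binom{m+1}{1} A_1(m) + \binom{m+1}{2}\frac{1}{m}.$$

Thus, $A_1(m) = \frac{1}{2}$, yielding $A_1(m-1) = A_1(m-2) = \cdots = \frac{1}{2}$. Similarly, equate
the coefficients of $n^{m-1}$ to obtain

$$\binom{m+1}{2} = \binom{m+1}{1} A_2(m) + \binom{m+1}{2}\frac{1}{2} + \binom{m+1}{3}\frac{1}{m-1}.$$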

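Solve for $A_2(m)$:

$$A_2(m) = \frac{1}{6}\cdot\frac{m}{1\cdot 2} \quad \text{and thus} \quad A_2(m-1) = \frac{1}{6}\cdot\frac{m-1}{1\cdot 2} \quad \text{and so on.}$$

Equating the coefficients of $n^{m-2}$ produces

$$\binom{m+1}{3} = \binom{m+1}{1} A_3(m) + \binom{m+1}{2}\frac{1}{6}\cdot\frac{m-1}{1\cdot 2} + \binom{m+1}{3}\frac{1}{2} + \binom{m+1}{4}\frac{1}{m-2}.$$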

Therefore, $A_3(m) = 0$. By the same method, equating the coefficients of $n^{m-3}$, we
can determine that

$$A_4(m) = -\frac{1}{30}\cdot\frac{m(m-1)(m-2)}{1\cdot 2\cdot 3\cdot 4}.$$


Substituting these values into (2.30), we arrive at Bernoulli’s formula (2.25). 




It appears that Pascal did not give the coefficients of the powers of $n$, other than
$n^{k+1}$. Seki and Bernoulli seem to have arrived at their formula for the sums of
powers of integers by incomplete induction and we have no indication of their proofs.
Bernoulli was apparently unaware of Pascal’s book on the arithmetical triangle;²⁶ had
he seen it, he might have derived the proof by using (2.30).


2.7 Euler 


Both Seki and Bernoulli defined the Bernoulli-Seki numbers by means of the
equations

$$K_0 = 1, \quad K_1 = \frac{1}{2},$$

and

$$\frac{1}{m+1}\left(K_0 + \binom{m+1}{1} K_1 + \binom{m+1}{2} K_2 + \cdots + \binom{m+1}{m} K_m\right) = 1, \tag{2.31}$$

where $m = 2, 3, \ldots$. Recall that $K_i$ here represents the modern $i$th Bernoulli number
$B_i$, except that $B_1 = K_1 - 1$.
First, note that the generating function for a sequence $a_1, a_2, a_3, \ldots$ is defined as the
function

$$f(x) = 1 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots;$$

the exponential generating function is then defined for $a_1, a_2, a_3, \ldots$ as the function

$$F(x) = 1 + a_1\frac{x}{1!} + a_2\frac{x^2}{2!} + a_3\frac{x^3}{3!} + \cdots.$$


Euler encountered the same relations as in (2.31) when he discovered the Euler-Maclaurin
formula, given in a paper of 1739, published in 1750, “De Seriebus
Quibusdam Considerationes.”²⁷ In this paper, which also contains his asymptotic series
for $f(x) - f(x+h) + f(x+2h) - \cdots$, now known as Boole’s formula, he found
the generating function for $C_0, C_1, C_2, \ldots$, defined by the equations

$$C_0 = 1$$

and

$$C_m = \frac{C_{m-1}}{2!} - \frac{C_{m-2}}{3!} + \frac{C_{m-3}}{4!} - \cdots + (-1)^{m-2}\frac{C_1}{m!} + (-1)^{m-1}\frac{C_0}{(m+1)!}.$$


26 Bernoulli and Sylla (2006) p. 99. 
27 Eu. 1-14 pp. 407-462. E 130 § 27. 
Euler observed that these relations could be obtained by equating the coefficients
of the powers of $z$ in the equation

$$\left(1 + C_1 z + C_2 z^2 + \cdots\right)\left(1 - \frac{z}{2!} + \frac{z^2}{3!} - \frac{z^3}{4!} + \cdots\right) = 1.$$

Euler thus found that the generating function for the numbers in question was

$$1 + C_1 z + C_2 z^2 + \cdots = \frac{1}{1 - \frac{z}{1\cdot 2} + \frac{z^2}{1\cdot 2\cdot 3} - \frac{z^3}{1\cdot 2\cdot 3\cdot 4} + \cdots} = \frac{z}{1 - e^{-z}} = \frac{z e^z}{e^z - 1}. \tag{2.32}$$


But by equating the coefficients of the powers of $z$ in the relation obtained from
(2.32):

$$\left(1 + C_1 z + C_2 z^2 + C_3 z^3 + \cdots\right)\left(z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots\right) = z + z^2 + \frac{z^3}{2!} + \frac{z^4}{3!} + \cdots,$$

one obtains the relations in (2.31). Thus, the sequence $\frac{K_i}{i!}$ is the same as the sequence
$C_i$ for $i \geq 1$. Note that the odd Bernoulli-Seki numbers $K_{2k+1} = B_{2k+1}$ for $k \geq 1$ can
be shown to be zero. As Euler noted in sections 24-28 of his “De numero memorabili
in summatione progressiones harmonicae naturalis occurrente,”²⁸ since $K_1 = \frac{1}{2}$ it
follows from (2.32) that

$$1 + \frac{K_2}{2!} z^2 + \frac{K_3}{3!} z^3 + \frac{K_4}{4!} z^4 + \cdots = \frac{z e^z}{e^z - 1} - \frac{z}{2}, \tag{2.33}$$


whose right-hand side is an even function. Therefore, the coefficients of the odd
powers of $z$ in (2.33) must all be zero. This fact also follows from Faulhaber’s equation
(2.13). Observe that equation (2.13) shows that $S_n^{(2m+1)}$ is divisible by $N^2$, where $N = \frac{n(n+1)}{2}$,
for $m \geq 1$, so that $S_n^{(2m+1)}$ is divisible by $n^2$, and thus that the coefficient of $n$
in the polynomial for $S_n^{(2m+1)}$ is zero when $m \geq 1$. Now the coefficient of $n$ is
$K_{2m+1} = B_{2m+1}$; it follows that $K_{2m+1} = B_{2m+1} = 0$ for $m \geq 1$.

From this point onward, it makes sense to use only the expression $B_j$ and omit
the use of $K_j$ to denote the $j$th Bernoulli number. Thus, the exponential generating
function of the sequence of $B_1, B_2, B_3, B_4, \ldots$ by (2.33) would be given by²⁹

$$1 + \sum_{n=1}^{\infty} \frac{B_n}{n!} x^n = \frac{x e^x}{e^x - 1} - x = \frac{x}{e^x - 1}. \tag{2.34}$$


Euler also found two different proofs that $(-1)^{n-1} B_{2n} > 0$ for $n = 1, 2, 3, \ldots$. One
depended on the formula


28 Eu. 1-15 pp. 569-603. E 583 § 24-28. 
29 ibid. § 24. 
$$\sum_{n=1}^{\infty} \frac{1}{n^{2k}} = \frac{(-1)^{k-1} 2^{2k-1} \pi^{2k} B_{2k}}{(2k)!}. \tag{2.35}$$

Note that the left-hand side of (2.35) is positive, showing that $(-1)^{k-1} B_{2k}$ must
be positive; in Chapter 16 we discuss how this formula may be proved, in connection
with Euler’s and Spence’s work.

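The second proof used the generating function: In sections 26-29 of his “De
numero memorabili,” he observed that since $B_{2m+1} = 0$ for $m \geq 1$, (2.33) implied
that

$$f(z) = 1 + \frac{B_2}{2!} z^2 + \frac{B_4}{4!} z^4 + \frac{B_6}{6!} z^6 + \cdots = \frac{z}{e^z - 1} + \frac{z}{2}. \tag{2.36}$$

Therefore

$$\begin{aligned}
z f'(z) &= \frac{z}{2} + \frac{z}{e^z - 1} - \frac{z^2 e^z}{(e^z - 1)^2} \\
&= \frac{z}{2} + \frac{z}{e^z - 1} - \frac{z^2}{e^z - 1} - \frac{z^2}{(e^z - 1)^2} \\
&= f - f^2 + \frac{z^2}{4}.
\end{aligned} \tag{2.37}$$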


Substituting the series for $f$ into the relation (2.37) and equating the coefficient of
$z^{2n}$ for $n > 1$, on each side, Euler had

$$\frac{2n B_{2n}}{(2n)!} = \frac{B_{2n}}{(2n)!} - \sum_{m=0}^{n} \frac{B_{2m}}{(2m)!}\cdot\frac{B_{2n-2m}}{(2n-2m)!}. \tag{2.38}$$


In fact, Euler explicitly wrote down the coefficients for $n = 1, 2, 3, 4, 5, 6$ and then
wrote “etc.” Observe that (2.38) implies

$$(2n+1) B_{2n} = -\sum_{m=1}^{n-1} \binom{2n}{2m} B_{2m} B_{2n-2m},$$

from which we can inductively deduce that $(-1)^{n-1} B_{2n} > 0$: It is clear that
$(-1)^0 B_2 = \frac{1}{6} > 0$. Supposing the result true up to $n - 1$, one has

$$(-1)^{n-1}(2n+1) B_{2n} = \sum_{m=1}^{n-1} \binom{2n}{2m} (-1)^{m-1} B_{2m}\,(-1)^{n-m-1} B_{2(n-m)} > 0.$$


Naturally, in the mathematical approach of his time, Euler did not write this
argument in the form presented here. Instead, he showed that

$$B_2 > 0 \Rightarrow -B_4 > 0 \Rightarrow B_6 > 0 \Rightarrow -B_8 > 0 \Rightarrow B_{10} > 0 \Rightarrow -B_{12} > 0,$$

at which point he wrote “etc.”
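
The quadratic recurrence derived from (2.38) and the resulting sign pattern can be checked numerically. This is a sketch only: the helper names are illustrative, and the Bernoulli numbers are generated independently from the modern linear recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ with $B_1 = -\frac{1}{2}$.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return B_0, ..., B_{count-1} (B_1 = -1/2) from sum_{j<=m} C(m+1, j) B_j = 0."""
    bs = [Fraction(1)]
    for m in range(1, count):
        bs.append(-sum(comb(m + 1, j) * bs[j] for j in range(m)) / (m + 1))
    return bs

def euler_recurrence_holds(n, bs):
    """Check (2n + 1) B_{2n} = -sum_{m=1}^{n-1} C(2n, 2m) B_{2m} B_{2n-2m}."""
    rhs = -sum(comb(2 * n, 2 * m) * bs[2 * m] * bs[2 * n - 2 * m]
               for m in range(1, n))
    return (2 * n + 1) * bs[2 * n] == rhs

bs = bernoulli_numbers(21)
# The recurrence is stated for n > 1, as in the text.
print(all(euler_recurrence_holds(n, bs) for n in range(2, 11)))
# Sign pattern (-1)^(n-1) B_{2n} > 0.
print(all((-1) ** (n - 1) * bs[2 * n] > 0 for n in range(1, 11)))
```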
2.8 Lacroix’s Proof of Bernoulli’s Formula


In the first part of his 1755 differential calculus book Institutiones Calculi Differ- 
entialis, Euler suggested deriving Bernoulli’s formula, (2.25), by means of finite 
differences, but he did not provide any details. In the second part of his book, Euler 
derived (2.25) using the Euler—Maclaurin summation formula. Sylvestre F. Lacroix 
(1765-1843), in the third volume of his important text on calculus, summarized 
Euler’s ideas on finite differences and then indicated how they could be worked into a 
proof of (2.25). 

Lacroix investigated partial differential equations under the tutelage of Gaspard 
Monge but did not pursue mathematical research. Rather, at the urging of Condorcet, 
he decided that his broad knowledge of eighteenth-century mathematics should be put 
to use in the writing of elementary and advanced mathematics textbooks. These books 
were widely popular, going into numerous editions and translations. I here summarize 
Lacroix’s treatment of Bernoulli’s formula (2.25). 

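Sylvestre Lacroix essentially redefined the first Bernoulli-Seki number to be $-\frac{1}{2}$
instead of $\frac{1}{2}$ by writing the sum $S_{n-1}^{(p)}$ as a polynomial in $n$. To see his approach,
consider

$$S_n^{(p)} = \frac{1}{p+1} n^{p+1} + \frac{1}{2} n^p + \cdots;$$

subtract $n^p$ from each side to get

$$S_{n-1}^{(p)} = \frac{1}{p+1}\left(n^{p+1} - \binom{p+1}{1}\frac{1}{2} n^p + \binom{p+1}{2} B_2 n^{p-1} + \binom{p+1}{3} B_3 n^{p-2} + \cdots\right). \tag{2.39}$$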


Now let $B_1 = -\frac{1}{2}$ to obtain the modern definition of the Bernoulli (or Bernoulli-Seki)
numbers as the sequence $B_0 = 1$, $B_1 = -\frac{1}{2}$, $B_2 = \frac{1}{6}, \ldots$ as defined by the
equations

$$B_0 = 1, \qquad 1 + \binom{m+1}{1} B_1 + \binom{m+1}{2} B_2 + \binom{m+1}{3} B_3 + \cdots + \binom{m+1}{m} B_m = 0, \quad m = 1, 2, 3, \ldots. \tag{2.40}$$

Equations (2.40) are obtained upon taking $n = 1$ in (2.39), because $S_{n-1}^{(p)} = 0$ for
$n = 1$. Also observe that the $B_1, B_2, B_3, \ldots$ are uniquely defined by (2.40). On pages
69 and 70 of his Traité des différences et des séries, published in 1800, Lacroix gave
a proof³⁰ of Bernoulli’s result (2.39).


30 Lacroix (1800) pp. 69-70. 
Using Lacroix’s notation, let $\sum x^m$ denote the sum $\sum_{k=1}^{x-1} k^m = S(x)$. Note that,
unlike Euler and Bernoulli, Lacroix took the sum up to $x - 1$, rather than $x$. With this
in mind, he had

$$S(x+1) - S(x) = x^m.$$

Now Lacroix did not use subscripts, as in $A_k$, but we use the modern notation: Assume

$$S(x) = \sum_{k=0}^{m+1} A_k x^{m+1-k}.$$

Then

$$x^m = \sum_{k=0}^{m+1} A_k\left((x+1)^{m+1-k} - x^{m+1-k}\right) = \sum_{k=0}^{m+1} A_k\left(\binom{m+1-k}{1} x^{m-k} + \binom{m+1-k}{2} x^{m-k-1} + \cdots\right).$$


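Equate the powers of $x$ to get

$$A_0 = \frac{1}{m+1}; \quad A_1 = -A_0\frac{m+1}{2} = -\frac{1}{2}; \quad A_2 = -A_0\frac{(m+1)m}{2\cdot 3} - A_1\frac{m}{2} = \frac{1}{6}\cdot\frac{m}{2};$$

$$A_3 = -A_0\frac{(m+1)m(m-1)}{2\cdot 3\cdot 4} - A_1\frac{m(m-1)}{2\cdot 3} - A_2\frac{m-1}{2} = 0; \ \ldots$$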


Lacroix wrote that from these equations one could successively obtain the values of
the coefficients $A_k$, and he explicitly gave the values of $A_k$, $k = 0, 1, \ldots, 20$.
Although he did not work out the general case, it is easy to do: Write
$A_k = \frac{1}{m+1}\binom{m+1}{k} a_k$ and equate the coefficient of $x^{m-k}$ to get

$$\binom{m+1}{0}\binom{m+1}{k+1} a_0 + \binom{m+1}{1}\binom{m}{k} a_1 + \binom{m+1}{2}\binom{m-1}{k-1} a_2 + \cdots + \binom{m+1}{k}\binom{m+1-k}{1} a_k = 0.$$

Divide by $\binom{m+1}{k+1}$ to find that

$$\binom{k+1}{0} a_0 + \binom{k+1}{1} a_1 + \binom{k+1}{2} a_2 + \cdots + \binom{k+1}{k} a_k = 0.$$

Observe that since $a_0 = 1$, the equations defining $a_1, a_2, a_3, \ldots$ are identical to
(2.40), so that $a_1 = B_1$, $a_2 = B_2$, $a_3 = B_3$, and so on. In this manner, Lacroix has
demonstrated the Bernoulli-Seki formula.
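
As a small check of this argument (a worked instance added here for illustration, not taken from Lacroix), take $m = 3$ and use the substitution $A_k = \frac{1}{m+1}\binom{m+1}{k} a_k$ from the text:

```latex
% m = 3: the sum of cubes up to x - 1 is (x(x-1)/2)^2.
S(x) = \sum_{k=1}^{x-1} k^3 = \tfrac14 x^4 - \tfrac12 x^3 + \tfrac14 x^2,
\qquad A_0 = \tfrac14,\; A_1 = -\tfrac12,\; A_2 = \tfrac14,\; A_3 = 0.
% Invert A_k = \frac{1}{4}\binom{4}{k} a_k:
a_0 = \frac{4A_0}{\binom{4}{0}} = 1, \quad
a_1 = \frac{4A_1}{\binom{4}{1}} = -\tfrac12, \quad
a_2 = \frac{4A_2}{\binom{4}{2}} = \tfrac16, \quad
a_3 = \frac{4A_3}{\binom{4}{3}} = 0.
```

These are $B_0, B_1, B_2, B_3$ in the modern convention, as the general argument predicts.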
2.9 Jacobi on Faulhaber


In an 1834 issue of Crelle’s Journal, Jacobi published an important paper giving
a rigorous derivation of the Euler-Maclaurin summation formula with remainder
term.³¹ We discuss this paper in Chapter 20. Near the end of this paper, however,
Jacobi gave a very brief treatment of Faulhaber’s work on sums of powers of
integers, but without mentioning Faulhaber. I saw a copy of Faulhaber’s book³² in the
Cambridge University Library. It is stated on the title page that that book had belonged
to Jacobi; on the previous blank page “J. F. Pfaff” is written. It appears that Jacobi may
have acquired the book after the death of Pfaff in 1825. It is thus possible that
Jacobi was in possession of Faulhaber’s book before writing his 1834 paper.

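Jacobi wrote the formulas for $\sum_{i=1}^{n} i^{2k-1}$ in powers of $u = n(n+1)$, rather than
in powers of $N = \frac{n(n+1)}{2}$. He also observed that

$$\sum_{i=1}^{n} i^{2k} = \frac{1}{2k+1}\,\frac{d}{dn}\sum_{i=1}^{n} i^{2k+1}; \tag{2.41}$$

this follows immediately from Bernoulli’s formula (2.26) or (2.28), since $B_{2k+1} = 0$.
Moreover, he noted that

$$\frac{d}{dn}\sum_{i=1}^{n} i^{2k} = 2k\sum_{i=1}^{n} i^{2k-1} + B_{2k}. \tag{2.42}$$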


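Jacobi began the last part of his paper by explicitly writing the sums for $k = 2, 3, \ldots$:

$$\begin{aligned}
\sum_{x=1}^{n} x^5 &= \frac{1}{6} u^2\left(u - \frac{1}{2}\right), \\
\sum_{x=1}^{n} x^7 &= \frac{1}{8} u^2\left(u^2 - \frac{4}{3} u + \frac{2}{3}\right), \\
&\;\;\vdots \\
\sum_{x=1}^{n} x^{13} &= \frac{1}{14} u^2\left(u^5 - \frac{35}{6} u^4 + \frac{287}{15} u^3 - \frac{118}{3} u^2 + \frac{691}{15} u - \frac{691}{30}\right).
\end{aligned}$$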

To obtain a general form of the sum $\sum_{i=1}^{n} i^{2k-1}$ Jacobi set

$$\sum_{i=1}^{n} i^{2k-3} = \frac{1}{2k-2}\left(u^{k-1} - a_1 u^{k-2} + \cdots + (-1)^{k-1} a_{k-3} u^2\right) \tag{2.43}$$


31 Jacobi (1969) vol. 6, pp. 64-75. 
32 Faulhaber (1631). 
and

$$\sum_{i=1}^{n} i^{2k-1} = \frac{1}{2k}\left(u^k - b_1 u^{k-1} + b_2 u^{k-2} - \cdots + (-1)^k b_{k-2} u^2\right). \tag{2.44}$$


He then stated that the coefficients $a_i$, $i = 1, \ldots, k-3$ and $b_j$, $j = 1, \ldots, k-2$
would satisfy the relations

$$\begin{aligned}
2k(2k-1) a_1 &= (2k-2)(2k-3) b_1 - k(k-1), \\
2k(2k-1) a_2 &= (2k-4)(2k-5) b_2 - (k-1)(k-2) b_1, \\
2k(2k-1) a_3 &= (2k-6)(2k-7) b_3 - (k-2)(k-3) b_2, \\
&\;\;\vdots \\
2k(2k-1) a_{k-3} &= 5\cdot 6\, b_{k-3} - 3\cdot 4\, b_{k-4}, \\
0 &= 3\cdot 4\, b_{k-2} - 2\cdot 3\, b_{k-3}.
\end{aligned} \tag{2.45}$$

Observe that these relations show that if $a_i > 0$, $i = 1, 2, \ldots, k-3$, then $b_j > 0$
for $j = 1, 2, \ldots, k-2$. The relation (2.45) shows that $b_{k-3} = 2 b_{k-2}$. Taking note
of (2.41), we have completed the proof of Faulhaber’s statements (2.13), (2.14), and
(2.15). Note that (2.15) is essentially the derivative of (2.13) with respect to $n$.

Though Jacobi did not provide details of the proof of the relations between $a_i$ and
$b_j$, they are straightforward. They follow from an application of (2.41) through (2.44)
and observing that since $u = n^2 + n$, $\frac{du}{dn} = 2n + 1$. Taking the second derivative of
(2.44) with respect to $n$ produces

$$\begin{aligned}
(2k-2)\sum_{i=1}^{n} i^{2k-3} + B_{2k-2}
= {} & \frac{2}{2k(2k-1)}\left(k u^{k-1} - (k-1) b_1 u^{k-2} + (k-2) b_2 u^{k-3} - \cdots + (-1)^k 2 b_{k-2} u\right) \\
& + \frac{(2n+1)^2}{2k(2k-1)}\left(k(k-1) u^{k-2} - (k-1)(k-2) b_1 u^{k-3} + (k-2)(k-3) b_2 u^{k-4} - \cdots + (-1)^k 2 b_{k-2}\right) \\
= {} & u^{k-1} - a_1 u^{k-2} + a_2 u^{k-3} - \cdots + (-1)^{k-1} a_{k-3} u^2 + B_{2k-2}.
\end{aligned} \tag{2.46}$$

Now note that $(2n+1)^2 = 4u + 1$; apply this in (2.46) and equate the coefficients
of the respective powers of $u$ to obtain Jacobi’s relations among the $a_i$ and $b_j$.
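
To see the relations at work in the smallest nontrivial case (a worked instance added for illustration, using the classical closed forms for the sums of fifth and seventh powers in terms of $u = n(n+1)$), take $k = 4$:

```latex
% k = 4: the sums of fifth and seventh powers in terms of u = n(n+1).
\sum_{i=1}^{n} i^{5} = \tfrac16\left(u^3 - a_1 u^2\right), \quad a_1 = \tfrac12,
\qquad
\sum_{i=1}^{n} i^{7} = \tfrac18\left(u^4 - b_1 u^3 + b_2 u^2\right), \quad b_1 = \tfrac43,\; b_2 = \tfrac23.
% First relation: 2k(2k-1) a_1 = (2k-2)(2k-3) b_1 - k(k-1).
8\cdot 7\cdot\tfrac12 = 28 = 6\cdot 5\cdot\tfrac43 - 4\cdot 3.
% Last relation: 0 = 3\cdot 4\, b_{k-2} - 2\cdot 3\, b_{k-3}, here with b_2 and b_1.
3\cdot 4\cdot\tfrac23 - 2\cdot 3\cdot\tfrac43 = 8 - 8 = 0.
```

For $n = 2$, so that $u = 6$, the two closed forms give $33 = 1 + 2^5$ and $129 = 1 + 2^7$, as they should.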


2.10 Jacobi and Raabe on Bernoulli Polynomials 


Observe that equations (2.41) and (2.42) are not clearly stated: $n$ represents a positive
integer in the sum $\sum_{i=1}^{n}$, while it is a real variable in the expression $\frac{d}{dn}$. In this section
we will employ a very simple result on polynomials, used by Jacobi, to show how this
inconsistency can in fact be ignored. First let $n$ denote an integer variable and $x$ denote
a real variable. For any integer $k > 0$, Bernoulli’s formula states that

$$\sum_{i=1}^{n-1} i^k = \frac{1}{k+1}\left(n^{k+1} + \binom{k+1}{1} B_1 n^k + \binom{k+1}{2} B_2 n^{k-1} + \cdots + \binom{k+1}{k-1} B_{k-1} n^2 + \binom{k+1}{k} B_k n\right). \tag{2.47}$$

Now define $B_k(x)$, the $k$th Bernoulli polynomial, by

$$\begin{aligned}
B_k(x) &= \frac{d}{dx}\,\frac{1}{k+1}\left(x^{k+1} + \binom{k+1}{1} B_1 x^k + \cdots + \binom{k+1}{k-1} B_{k-1} x^2 + \binom{k+1}{k} B_k x\right) \\
&= x^k + \binom{k}{1} B_1 x^{k-1} + \binom{k}{2} B_2 x^{k-2} + \cdots + \binom{k}{k-1} B_{k-1} x + B_k.
\end{aligned} \tag{2.48}$$


Recall from (2.24) that

$$B_1 = -\frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad B_3 = 0, \quad B_4 = -\frac{1}{30}, \ \ldots;$$

hence, all nonzero terms following after the term $\binom{k}{1} B_1 x^{k-1}$ must be of the form

$$\binom{k}{2s} B_{2s}\, x^{k-2s}.$$

Observe that, as we have seen in (2.29), (2.47) may be written as

$$\sum_{i=1}^{n-1} i^k = \frac{B_{k+1}(n) - B_{k+1}}{k+1} \tag{2.49}$$

and that

$$\frac{d}{dx} B_k(x) = k B_{k-1}(x), \quad k \geq 1. \tag{2.50}$$

The general theorem on polynomials, used by Jacobi, is not difficult to prove and
in fact could have been provided in the seventeenth century. Cauchy³³ stated it thus:
Suppose $P(x)$ and $Q(x)$ are polynomials of degree at most $n$, and for at least $n + 1$
distinct complex numbers $x_1, x_2, \ldots, x_{n+1}$, $P(x_j) = Q(x_j)$ with $j = 1, 2, \ldots, n+1$;
then $P(x) = Q(x)$. To verify this, Cauchy first showed that if a polynomial vanished
at $x = x_1, x_2, \ldots, x_{n+1}$, then it was divisible by $(x - x_1)(x - x_2)\cdots(x - x_{n+1})$.
He next assumed that $P(x)$ was not identical with $Q(x)$ and pointed out that when they
were not identical, $P(x) - Q(x)$ was a nonzero polynomial of degree at most $n$ and it
vanished at $x = x_1, \ldots, x_{n+1}$. Hence, it was divisible by $(x - x_1)(x - x_2)\cdots(x - x_{n+1})$,


33 Cauchy (1989) § 4.1.
a polynomial of degree $n + 1$. This was absurd; therefore, $P(x) = Q(x)$. Thus, $x$ can
be replaced by $n$ as in equations (2.41) and (2.42).

The author has not found any application of this principle before the nineteenth
century, and, in spite of Cauchy’s work, it was used only infrequently until about
1850. For example, in a paper of 1845 on the $q$-extension of the binomial theorem,
Eisenstein used this principle on polynomials, but then gave an explicit statement of it
in a footnote, apparently reflecting his impression that the idea was not well-known.
Joseph Raabe did not use this principle in his 1848 work, Die Jacob Bernoullische
Function,³⁴ although it would have been natural to do so and would have greatly
shortened his proof of a result on Bernoulli polynomials.

Interestingly, however, the mathematical basis for the proof of this theorem was 
actually available in the seventeenth century and would seem to have been readily 
achievable by Descartes or Newton. In his 1821 work Analyse algébrique,³⁵ Cauchy
proved a theorem whose attempted proof by Euler had had a gap; Cauchy completed
the proof by use of this theorem on polynomials. We discuss this result in Chapter 4.

As an application of this general theorem on polynomials, consider Jacobi’s proof
of the formula³⁶

$$B_k(1 - x) = (-1)^k B_k(x). \tag{2.51}$$


Although Jacobi proved (2.51) only for the case in which $k$ is even, the proof
actually extends to all cases. Jacobi first noted that (2.49) would imply

$$\begin{aligned}
B_k(n+1) &= k\sum_{i=1}^{n} i^{k-1} + B_k = k\sum_{i=1}^{n-1} i^{k-1} + B_k + k n^{k-1} \\
&= B_k(n) + k n^{k-1}, \quad \text{for } n = 1, 2, 3, \ldots.
\end{aligned}$$

Thus, we can state that

$$B_k(x+1) - B_k(x) = k x^{k-1}, \tag{2.52}$$

because it holds true for an infinite number of values $x = 1, 2, 3, \ldots$. Now when $k$
is even, then every term in $B_k(x)$ must be even with the exception of $-\frac{k}{2} x^{k-1}$; when
$k$ is odd, the reverse is the case. Therefore

$$(-1)^{k-1} B_k(-x) + B_k(x) = -k x^{k-1}. \tag{2.53}$$

Adding (2.52) and (2.53) then produces

$$B_k(x+1) + (-1)^{k-1} B_k(-x) = 0. \tag{2.54}$$

Changing $x$ to $-x$ in (2.54), Jacobi obtained (2.51).
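
Both the difference equation (2.52) and the reflection formula (2.51) are polynomial identities, so they can be spot-checked at any rational point. The sketch below does this with exact arithmetic; the helper names and the test point $x = \frac{2}{7}$ are illustrative assumptions, and the Bernoulli numbers come from the modern recurrence with $B_1 = -\frac{1}{2}$.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return B_0, ..., B_{count-1} with B_1 = -1/2, via sum_{j<=m} C(m+1, j) B_j = 0."""
    bs = [Fraction(1)]
    for m in range(1, count):
        bs.append(-sum(comb(m + 1, j) * bs[j] for j in range(m)) / (m + 1))
    return bs

def bernoulli_poly(k, x, bs):
    """Evaluate B_k(x) = sum_{j=0}^{k} C(k, j) B_j x^(k-j), as in (2.48)."""
    return sum(comb(k, j) * bs[j] * x ** (k - j) for j in range(k + 1))

bs = bernoulli_numbers(9)
x = Fraction(2, 7)  # arbitrary rational test point
# (2.51): B_k(1 - x) = (-1)^k B_k(x)
reflection_ok = all(
    bernoulli_poly(k, 1 - x, bs) == (-1) ** k * bernoulli_poly(k, x, bs)
    for k in range(9)
)
# (2.52): B_k(x + 1) - B_k(x) = k x^(k-1)
difference_ok = all(
    bernoulli_poly(k, x + 1, bs) - bernoulli_poly(k, x, bs) == k * x ** (k - 1)
    for k in range(1, 9)
)
print(reflection_ok, difference_ok)
```

A check at one point is of course not a proof, but by the theorem on polynomials quoted above, agreement at more than $k$ points would already force the identity for degree $k$.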


34 Raabe (1848). 

35 Cauchy (1989). 

36 Jacobi (1834) or Jacobi (1969) vol. 6, pp. 64-75. See equations (16) through (22) and the remark given after 
equation (19). 
In section 4 of his 1834 paper on the Euler-Maclaurin formula, Jacobi gave a
method of finding the generating function of the Bernoulli polynomials. He actually
gave the generating function for only the even-degree polynomials, but his method
clearly applies in general. He first employed (2.49) and $B_1(n) = n + B_1$ from (2.48)
to obtain

$$\begin{aligned}
\sum_{k=0}^{\infty} B_k(n)\frac{t^k}{k!}
&= 1 + (n + B_1)t + \sum_{k=2}^{\infty}\left(k\sum_{j=1}^{n-1} j^{k-1} + B_k\right)\frac{t^k}{k!} \\
&= 1 + nt + \sum_{k=1}^{\infty} B_k\frac{t^k}{k!} + t\sum_{j=1}^{n-1}\sum_{k=2}^{\infty}\frac{(jt)^{k-1}}{(k-1)!} \\
&= 1 + \sum_{k=1}^{\infty} B_k\frac{t^k}{k!} + nt + t\sum_{j=1}^{n-1}\left(e^{jt} - 1\right) \\
&= \frac{t}{e^t - 1} + t + t\sum_{j=1}^{n-1} e^{jt} \\
&= \frac{t}{e^t - 1} + t + \frac{t e^t\left(e^{(n-1)t} - 1\right)}{e^t - 1} \\
&= \frac{t e^t}{e^t - 1} + \frac{t e^{nt} - t e^t}{e^t - 1} = \frac{t e^{nt}}{e^t - 1}.
\end{aligned}$$


Jacobi thus obtained the generating function for Bernoulli polynomials in the
integer variable $n$. Now take $x$ to be a real or complex variable and suppose, as Jacobi
himself could have, that

$$\frac{t e^{xt}}{e^t - 1} = \sum_{k=0}^{\infty} C_k(x)\frac{t^k}{k!}.$$

Clearly, $C_k(x)$ is a polynomial of degree $k$ and $C_k(n) = B_k(n)$ for all positive
integers $n$. Hence $C_k(x) = B_k(x)$ and the generating function for the Bernoulli
polynomials is

$$\sum_{k=0}^{\infty} B_k(x)\frac{t^k}{k!} = \frac{t e^{xt}}{e^t - 1}. \tag{2.55}$$

Joseph Raabe called $\frac{1}{k}\left(B_k(x) - B_k\right)$ the “Bernoullische Function” in his 1848 book
of that name. In this book he proved some interesting properties of $B_k(x)$, including
what is now called the multiplication formula for Bernoulli polynomials:

$$B_k(nx) = n^{k-1}\sum_{j=0}^{n-1} B_k\left(x + \frac{j}{n}\right). \tag{2.56}$$
Raabe first proved (2.56) for the case in which $k$ is an odd integer; when $k = 1$ the
result is immediate. Raabe observed that when $k$ is odd and $1 \leq j \leq n - 1$, (2.51)
would imply that

$$B_k\left(1 - \frac{j}{n}\right) + B_k\left(\frac{j}{n}\right) = 0, \quad j = 1, 2, \ldots, n-1. \tag{2.57}$$

Adding the $n - 1$ formulas in (2.57), Raabe concluded that

$$\sum_{j=1}^{n-1} B_k\left(\frac{j}{n}\right) = 0. \tag{2.58}$$


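He next noted that, with $r$ a positive integer, a repeated application of (2.52) would
imply

$$B_k(x + r) = B_k(x) + k\left(x^{k-1} + (x+1)^{k-1} + \cdots + (x+r-1)^{k-1}\right). \tag{2.59}$$

Setting $x = \frac{j}{n}$, $j = 1, 2, \ldots, n-1$ in (2.59) and adding the $n - 1$ equations, Raabe
obtained

$$\sum_{j=0}^{n-1} B_k\left(r + \frac{j}{n}\right) = \sum_{j=1}^{n-1} B_k\left(\frac{j}{n}\right) + B_k(r) + k\sum_{j=1}^{n-1}\sum_{s=0}^{r-1}\left(\frac{ns + j}{n}\right)^{k-1}. \tag{2.60}$$

Since Raabe took $k$ as odd, $B_k = 0$ and (2.29) produced

$$B_k(r) = k\left(1^{k-1} + 2^{k-1} + \cdots + (r-1)^{k-1}\right) = k\left(\left(\frac{n}{n}\right)^{k-1} + \left(\frac{2n}{n}\right)^{k-1} + \cdots + \left(\frac{(r-1)n}{n}\right)^{k-1}\right).$$

Thus, (2.58) implied that the right-hand side of (2.60) could be rewritten as

$$\frac{k}{n^{k-1}}\left(1^{k-1} + 2^{k-1} + \cdots + (nr - 1)^{k-1}\right) = \frac{1}{n^{k-1}} B_k(nr). \tag{2.61}$$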


This actually completes the proof of (2.56) for k odd, since both sides of the 
equation are polynomials and the equation holds for an infinite number of integers. 
But Raabe did not draw this conclusion at this point. After (2.61), he gave a lengthy 
argument to prove that r could be taken to be a positive rational number and then went 
on to show how to extend the result to negative rationals. Finally, he applied the idea 
of continuity to further extend his result to all real numbers. However, we omit the 
details of this part of Raabe’s reasoning because, in fact, he had already completed 
the proof of (2.56) for k odd when he demonstrated that it was true for an arbitrary 
positive integer x = r. 
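
The multiplication formula (2.56) can likewise be spot-checked with exact arithmetic, for odd and even $k$ alike. This is a sketch under the same assumptions as before: the helper names and the test point $x = \frac{3}{5}$ are illustrative, and $B_k(x)$ is evaluated from the expansion (2.48).

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return B_0, ..., B_{count-1} with B_1 = -1/2, via sum_{j<=m} C(m+1, j) B_j = 0."""
    bs = [Fraction(1)]
    for m in range(1, count):
        bs.append(-sum(comb(m + 1, j) * bs[j] for j in range(m)) / (m + 1))
    return bs

def bernoulli_poly(k, x, bs):
    """Evaluate B_k(x) = sum_{j=0}^{k} C(k, j) B_j x^(k-j), as in (2.48)."""
    return sum(comb(k, j) * bs[j] * x ** (k - j) for j in range(k + 1))

def raabe_holds(k, n, x, bs):
    """Check B_k(n x) = n^(k-1) * sum_{j=0}^{n-1} B_k(x + j/n), i.e. (2.56)."""
    lhs = bernoulli_poly(k, n * x, bs)
    rhs = Fraction(n) ** (k - 1) * sum(
        bernoulli_poly(k, x + Fraction(j, n), bs) for j in range(n)
    )
    return lhs == rhs

bs = bernoulli_numbers(9)
x = Fraction(3, 5)  # arbitrary rational test point
print(all(raabe_holds(k, n, x, bs) for k in range(1, 9) for n in range(1, 6)))
```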

To prove (2.56) for even $k > 0$, take the derivative of (2.56) for odd $k$ and then use
(2.50) to conclude that (2.56) is true for all $k > 0$. Now Raabe did not give a proof for
$k$ even in this manner, but presented a much more elaborate argument. Recall that for
Raabe the $k$th Bernoullische Function was

$$C_k(x) = \frac{1}{k}\left(B_k(x) - B_k\right),$$

implying that while

$$B_{2k+1}(x) = (2k+1)\, C_{2k+1}(x),$$

he had

$$B_{2k}(x) = 2k\, C_{2k}(x) + B_{2k},$$

leading him to a more involved argument for $k$ even.


2.11 Ramanujan’s Recurrence Relations for Bernoulli Numbers 


Jakob Bernoulli’s recurrence relation (2.27) is not too practical to use in computational
situations; to find $B_{2n}$, we require the values of $B_2, B_4, \ldots, B_{2n-2}$. Taking Bernoulli’s
recurrence relation as having a gap of 2, Ramanujan’s relations have gaps of
4, 6, 8, 10, 12, 14 and these are clearly much more efficient for computations. For
example, given a formula with gaps of 6 with $B_2$ known, one can immediately obtain
$B_8$ and then $B_{14}$, and so on. We give one such example from Ramanujan’s paper,
“Some properties of Bernoulli numbers.” Bruce Berndt, editor of Ramanujan’s
notebooks, has commented on this paper: “It is fitting that Ramanujan’s first paper
is on Bernoulli numbers, for he clearly loved them. They permeate much of the work
in his notebooks.”³⁷ Ramanujan stated his result:³⁸


Suppose n is an odd integer; then 


G) |Bn—3| + (5) |Bn—9| + (3) |By—15|+--- =0, (2.62) 


where the constant term is 


ye aye ae 
6 3 3 


depending on whether n is of the form 6k + 1, 6k +5, or 6k + 3. 
Observe here that since

$$\cot x = \frac{i\left(e^{2ix} + 1\right)}{e^{2ix} - 1}, \tag{2.63}$$

37 Ramanujan (2000) p. 357. 
38 Ramanujan (1911) p. 222. 
(2.34) implies that we can state

$$x\cot x = 1 + \sum_{k=1}^{\infty} (-1)^k B_{2k}\frac{(2x)^{2k}}{(2k)!}. \tag{2.64}$$


We will prove (2.62) later in this section. For now, observe that the series (2.64) 
has only even powers with odd powers missing. Thus, we have a series with gaps of 2 
between the powers of x. In order to obtain relations among Bernoulli numbers with 
gaps of 6, we must use (2.64) to construct series with further gaps of 3 among the even 
powers of x. This method for constructing such series was published in 1759 by the 
self-taught English mathematician Thomas Simpson. We discuss this general method 
in Section 13.3 and here describe two particular cases of this method using square 
roots and cube roots of unity. 


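Suppose

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots.$$

Then

$$f(-x) = a_0 - a_1 x + a_2 x^2 - a_3 x^3 + \cdots,$$

$$\frac{1}{2}\left(f(x) + f(-x)\right) = a_0 + a_2 x^2 + a_4 x^4 + \cdots$$

and

$$\frac{1}{2}\left(f(x) - f(-x)\right) = a_1 x + a_3 x^3 + a_5 x^5 + \cdots.$$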

Thus, Simpson’s suggested method has allowed us, from a given series, to produce
two series with gaps of 2 by using the square roots $\pm 1$ of unity. In a similar manner,
we can use cube roots of unity to produce gaps of 3. Denote the cube roots of unity by
$1, \omega, \omega^2$, where $\omega = e^{2\pi i/3}$, and note that

$$1 + \omega + \omega^2 = 0 \quad\text{and}\quad \omega^3 = 1. \tag{2.65}$$
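The root-of-unity averaging described above can be tried numerically. The following is a minimal sketch, not taken from Simpson or from the text: the helper names `gap_filter` and `exp_gap3` are hypothetical, and the exponential function is chosen only because its coefficients are easy to sum directly.

```python
import cmath
from math import factorial

def gap_filter(f, x, m, r=0):
    """Keep the terms of f's power series whose exponents are congruent to r mod m.

    Uses the average (1/m) * sum_j w^(-r*j) * f(w^j * x), where w is a
    primitive m-th root of unity.
    """
    w = cmath.exp(2j * cmath.pi / m)
    return sum(w ** (-r * j) * f(w ** j * x) for j in range(m)) / m

def exp_gap3(x, terms=20):
    """Direct sum of x^(3k)/(3k)!, the gap-3 section of exp(x)."""
    return sum(x ** (3 * k) / factorial(3 * k) for k in range(terms))

x = 0.7
filtered = gap_filter(cmath.exp, x, 3)   # (e^x + e^(wx) + e^(w^2 x)) / 3
direct = exp_gap3(x)
print(abs(filtered - direct))            # difference at rounding level
```

With `m = 2` the same function reproduces the even and odd parts of a series, which is the square-root-of-unity case above.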


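Next, let us consider how to produce gaps of 6 from (2.64). First, to avoid the
alternating signs in (2.64), we take the absolute values of $B_{2k}$ and use
$(-1)^{k-1} B_{2k} = |B_{2k}|$ to obtain

$$\frac{x}{2}\cot\frac{x}{2} = 1 - |B_2|\frac{x^2}{2!} - |B_4|\frac{x^4}{4!} - |B_6|\frac{x^6}{6!} - \cdots. \tag{2.66}$$

Changing $x$ to $\omega x$ and then $x$ to $\omega^2 x$ in (2.66) we arrive at the two equations

$$\frac{\omega x}{2}\cot\frac{\omega x}{2} = 1 - |B_2|\frac{\omega^2 x^2}{2!} - |B_4|\frac{\omega^4 x^4}{4!} - |B_6|\frac{\omega^6 x^6}{6!} - \cdots, \tag{2.67}$$

$$\frac{\omega^2 x}{2}\cot\frac{\omega^2 x}{2} = 1 - |B_2|\frac{\omega^4 x^2}{2!} - |B_4|\frac{\omega^8 x^4}{4!} - |B_6|\frac{\omega^{12} x^6}{6!} - \cdots. \tag{2.68}$$

Upon adding (2.66), (2.67), and (2.68), while applying (2.65), we find that

$$\frac{x}{2}\left(\cot\frac{x}{2} + \omega\cot\frac{\omega x}{2} + \omega^2\cot\frac{\omega^2 x}{2}\right) = 3\left(1 - |B_6|\frac{x^6}{6!} - |B_{12}|\frac{x^{12}}{12!} - \cdots\right). \tag{2.69}$$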


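If we multiply (2.66), (2.67), (2.68) by $1, \omega, \omega^2$ respectively and then add the results,
we arrive at

$$\frac{x}{2}\left(\cot\frac{x}{2} + \omega^2\cot\frac{\omega x}{2} + \omega\cot\frac{\omega^2 x}{2}\right) = -3\left(|B_2|\frac{x^2}{2!} + |B_8|\frac{x^8}{8!} + |B_{14}|\frac{x^{14}}{14!} + \cdots\right). \tag{2.70}$$

Again, multiplying (2.66), (2.67), (2.68) by $2, 2\omega^2, 2\omega$ respectively, and then adding
the resulting equations, we obtain

$$x\left(\cot\frac{x}{2} + \cot\frac{\omega x}{2} + \cot\frac{\omega^2 x}{2}\right) = -6\left(|B_4|\frac{x^4}{4!} + |B_{10}|\frac{x^{10}}{10!} + |B_{16}|\frac{x^{16}}{16!} + \cdots\right). \tag{2.71}$$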


Ramanujan discovered recurrence relations for $|B_{2n}|$ with gaps of 6 by finding other
expressions for the left-hand sides of (2.69), (2.70), and (2.71).³⁹ He observed that


If $1, \omega, \omega^2$ be the three cube roots of unity, then
$$4\sin x \,\sin \omega x \,\sin \omega^2 x = -(\sin 2x + \sin 2\omega x + \sin 2\omega^2 x), \tag{2.72}$$
as may be easily verified.
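Before working through a proof, identity (2.72) can be spot-checked in floating point. This is only a numerical sanity check at a few arbitrarily chosen sample points (an assumption of this sketch), not a verification in Ramanujan's sense.

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)      # primitive cube root of unity

def lhs(x):
    return 4 * cmath.sin(x) * cmath.sin(w * x) * cmath.sin(w * w * x)

def rhs(x):
    return -(cmath.sin(2 * x) + cmath.sin(2 * w * x) + cmath.sin(2 * w * w * x))

# Sample points, real and complex, picked arbitrarily for illustration.
samples = (0.3, 1.1, 2.5, -0.8 + 0.4j)
max_err = max(abs(lhs(x) - rhs(x)) for x in samples)
print(max_err)                        # rounding-level discrepancy
```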


One way to verify (2.72) is to make use of the two trigonometric identities that
follow from the addition formulas for the sine and cosine functions:

$$2\sin A \sin B = \cos(A - B) - \cos(A + B), \tag{2.73}$$
$$2\sin A \cos B = \sin(A + B) + \sin(A - B). \tag{2.74}$$

Now by (2.73) and (2.65)

$$2\sin x \sin \omega x = \cos(1 - \omega)x - \cos(1 + \omega)x = \cos(1 - \omega)x - \cos \omega^2 x;$$

multiply by $2\sin \omega^2 x$ and apply (2.74) to obtain


39 Ramanujan (1911) pp. 221-222.
$$4\sin x \sin \omega x \sin \omega^2 x = 2\sin \omega^2 x \cos(1 - \omega)x - 2\sin \omega^2 x \cos \omega^2 x$$

$$= \sin(1 - \omega + \omega^2)x + \sin(\omega + \omega^2 - 1)x - \sin 2\omega^2 x - \sin 0$$

$$= -\sin 2\omega x - \sin 2x - \sin 2\omega^2 x.$$


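This completes the proof of (2.72); Ramanujan next took its logarithmic derivative
to get

$$\cot x + \omega\cot \omega x + \omega^2\cot \omega^2 x = \frac{2(\cos 2x + \omega\cos 2\omega x + \omega^2\cos 2\omega^2 x)}{\sin 2x + \sin 2\omega x + \sin 2\omega^2 x}.$$

Then, writing $\frac{x}{2}$ for $x$ and multiplying by $\frac{x}{2}$, he found another expression for the
left-hand side of (2.69):

$$\frac{x}{2}\left(\cot\frac{x}{2} + \omega\cot\frac{\omega x}{2} + \omega^2\cot\frac{\omega^2 x}{2}\right) = \frac{x(\cos x + \omega\cos \omega x + \omega^2\cos \omega^2 x)}{\sin x + \sin \omega x + \sin \omega^2 x}. \tag{2.75}$$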


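Applying the power series expansions of $\sin x$ and $\cos x$, given in chapter 1, he
could express the right-hand side of (2.75) as a quotient of two power series with gaps
of 6. Combining these with (2.69), Ramanujan arrived at

$$3\left(1 - |B_6|\frac{x^6}{6!} - |B_{12}|\frac{x^{12}}{12!} - \cdots\right) = \frac{\dfrac{x^2}{2!} - \dfrac{x^8}{8!} + \dfrac{x^{14}}{14!} - \cdots}{\dfrac{x^2}{3!} - \dfrac{x^8}{9!} + \dfrac{x^{14}}{15!} - \cdots}. \tag{2.76}$$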
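To obtain a similar formula for (2.70), he noted that

$$\cot\frac{\omega x}{2} - \cot\frac{\omega^2 x}{2} = \frac{\cos \omega^2 x - \cos \omega x}{2\sin\frac{x}{2}\sin\frac{\omega x}{2}\sin\frac{\omega^2 x}{2}} = \frac{2(\cos \omega x - \cos \omega^2 x)}{\sin x + \sin \omega x + \sin \omega^2 x}. \tag{2.77}$$

For the next step, Ramanujan multiplied (2.77) by $\frac{x}{2}(\omega^2 - \omega)$ and added this result
to (2.75), arriving at

$$\frac{x}{2}\left(\cot\frac{x}{2} + \omega^2\cot\frac{\omega x}{2} + \omega\cot\frac{\omega^2 x}{2}\right) = \frac{x(\cos x + \omega^2\cos \omega x + \omega\cos \omega^2 x)}{\sin x + \sin \omega x + \sin \omega^2 x}. \tag{2.78}$$

This led to the relation

$$3\left(|B_2|\frac{x^2}{2!} + |B_8|\frac{x^8}{8!} + |B_{14}|\frac{x^{14}}{14!} + \cdots\right) = \frac{\dfrac{x^4}{4!} - \dfrac{x^{10}}{10!} + \dfrac{x^{16}}{16!} - \cdots}{\dfrac{x^2}{3!} - \dfrac{x^8}{9!} + \dfrac{x^{14}}{15!} - \cdots}. \tag{2.79}$$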
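Ramanujan then wrote,
Similarly,
$$\frac{x}{2}\left(\cot\frac{x}{2} + \cot\frac{\omega x}{2} + \cot\frac{\omega^2 x}{2}\right) = -\frac{x(\cos x + \cos \omega x + \cos \omega^2 x - 3)}{2(\sin x + \sin \omega x + \sin \omega^2 x)} \tag{2.80}$$
and therefore

$$6\left(|B_4|\frac{x^4}{4!} + |B_{10}|\frac{x^{10}}{10!} + |B_{16}|\frac{x^{16}}{16!} + \cdots\right) = \frac{\dfrac{x^6}{6!} - \dfrac{x^{12}}{12!} + \dfrac{x^{18}}{18!} - \cdots}{\dfrac{x^2}{3!} - \dfrac{x^8}{9!} + \dfrac{x^{14}}{15!} - \cdots}. \tag{2.81}$$

To prove this as Ramanujan did, write the left-hand side of (2.80) as

$$\frac{x}{2}\cdot\frac{\cos\frac{x}{2}\sin\frac{\omega x}{2}\sin\frac{\omega^2 x}{2} + \cos\frac{\omega x}{2}\sin\frac{x}{2}\sin\frac{\omega^2 x}{2} + \cos\frac{\omega^2 x}{2}\sin\frac{x}{2}\sin\frac{\omega x}{2}}{\sin\frac{x}{2}\sin\frac{\omega x}{2}\sin\frac{\omega^2 x}{2}}. \tag{2.82}$$

Now apply the identities (2.73) and (2.74) to the first term in the numerator of
(2.82); obtain the other two terms in the numerator by changing $x$ to $\omega x$ and $x$ to $\omega^2 x$
respectively. For the denominator, use (2.72).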

Ramanujan multiplied each of the equations (2.76), (2.79), (2.81) by the denomina-
tor on the right-hand side of each and then equated the coefficients of $x^n$, arriving at the
result stated in (2.62). He then used this recurrence relation to calculate the absolute
values of the Bernoulli numbers up through $B_{40}$; we show how he found $B_6$, $B_{12}$, $B_{18}$
in this manner. He employed the case for which the constant term was
$(-1)^{\frac{n-3}{6}}\,\frac{n-3}{3}$. Now, since $B_6$ is positive, $n = 9$ yields

$$\binom{9}{3} B_6 - 2 = 0 \quad\text{or}\quad B_6 = \frac{1}{42}.$$

With $n = 15$, we get

$$\binom{15}{3} B_{12} - \binom{15}{9} B_6 + 4 = 0;$$

divide by $\binom{15}{3}$ to find that

$$B_{12} = \frac{5005}{455}\cdot\frac{1}{42} - \frac{4}{455} = \frac{691}{2730}.$$

After taking $n = 21$ and dividing the resulting equation by $\binom{21}{3}$, we come to

$$B_{18} - 221\,B_{12} + \frac{204}{5}\,B_6 - \frac{3}{665} = 0 \quad\text{or}\quad B_{18} = \frac{43867}{798}.$$


As Wagstaff’s paper⁴⁰ has pointed out, many of Ramanujan’s results on Bernoulli
numbers were anticipated.
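The gap-6 recurrence can be run mechanically with exact rational arithmetic. The sketch below rests on stated assumptions: the sum in (2.62) is taken to alternate in sign and to stop at the lowest index that is at least 2, and the constant term is taken to be $(-1)^{(n-1)/6}\,n/6$, $(-1)^{(n+1)/6}\,n/3$, or $(-1)^{(n-3)/6}\,(n-3)/3$ for $n = 6k+1$, $6k+5$, $6k+3$ respectively. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def constant_term(n):
    """Constant term of the gap-6 recurrence for odd n (assumed form, see lead-in)."""
    r = n % 6
    if r == 1:
        return Fraction((-1) ** ((n - 1) // 6) * n, 6)
    if r == 5:
        return Fraction((-1) ** ((n + 1) // 6) * n, 3)
    if r == 3:
        return Fraction((-1) ** ((n - 3) // 6) * (n - 3), 3)
    raise ValueError("n must be odd")

def abs_bernoulli(limit):
    """Return {2k: |B_2k|} for 2 <= 2k <= limit (limit even)."""
    B = {}
    for n in range(5, limit + 4, 2):          # n - 3 runs over 2, 4, 6, ...
        # Terms after the leading one: indices n-9, n-15, ... down to >= 2.
        tail = sum((-1) ** i * comb(n, 3 + 6 * i) * B[n - 3 - 6 * i]
                   for i in range(1, (n - 5) // 6 + 1))
        B[n - 3] = -(constant_term(n) + tail) / comb(n, 3)
    return B

B = abs_bernoulli(18)
print(B[6], B[12], B[18])
```

Each new value needs only the earlier values whose indices differ from it by multiples of 6, which is the efficiency gain the text describes.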


40 Wagstaff (1981).


2.12 Notes on the Literature


Seki’s collected works were first published in 1974 in Japanese. However, the editors 
very helpfully added an English summary of Seki’s main achievements, including the 
results on Bernoulli numbers. In her 2006 translation of Bernoulli’s Ars Conjectandi, 
Edith Sylla has given a preface and an excellent 126-page introduction. 


3 


Infinite Product of Wallis 


3.1 Preliminary Remarks 
In 1655, John Wallis produced the following very important infinite product:

$$\frac{4}{\pi} = \frac{3}{2}\cdot\frac{3}{4}\cdot\frac{5}{4}\cdot\frac{5}{6}\cdots. \tag{3.1}$$

This result appeared in his Arithmetica Infinitorum, published in 1656.¹ The
passage of 350 years has not diminished the beauty and significance of Wallis’s
result, the culmination of a series of remarkable mathematical insights and audacious
guesses; his book exercised great influence on the early mathematical work of Newton
and Euler. We note that in 1593 François Viète gave the only earlier example of an
infinite product, a calculation of the value of $\pi$ by inscribing regular polygons in a
circle.² His formula can be written as

$$\frac{2}{\pi} = \frac{\sqrt{2}}{2}\cdot\frac{\sqrt{2+\sqrt{2}}}{2}\cdot\frac{\sqrt{2+\sqrt{2+\sqrt{2}}}}{2}\cdots.$$


John Wallis (1616-1703) apparently received little mathematical training at school 
or at Emmanuel College, Cambridge. He taught himself elementary arithmetic from 
textbooks belonging to his younger brother, who was going into a trade. It was only 
during the English Civil War (1642-1648) that Wallis’s mathematical inclinations 
began to be evident as he decoded letters for Parliament. The code operated by 
replacing letters with numerical values. Wallis gained a feeling for numerical rela- 
tionships through this experience, and he applied it to his mathematical researches for 
the Arithmetica Infinitorum. In fact, the manner in which he presented and analyzed 
the mathematical data in his book is reminiscent of the way in which he decoded 
messages. 

It was probably around 1646 that Wallis began delving more deeply into mathemat- 
ics, by studying the famous Clavis Mathematicae by William Oughtred (1574-1660), 


1 Wallis (1656). 
2 Viète (1593) p. 30, second leaf.
inventor of the slide rule. First published in 1631 and composed for the instruction
of the son of the Earl of Arundel, this book was widely studied and exerted a 
tremendous influence on seventeenth-century English mathematics. A second edition 
in English and then in Latin appeared in 1647 and 1648. The second edition was 
among the first mathematical texts studied by Newton as a student in 1664. In the 
1690s Newton recommended that the book be reprinted for the new generation of 
students of mathematics. The Clavis introduced Wallis to algebraic notation and to the 
method of applying algebra to geometric problems in the manner developed by Viéte 
in the 1590s. 

In 1649, Wallis was appointed to the Savilian Chair of Geometry at Oxford. The 
valuable service Wallis had provided to the winning side in the Civil War helped him 
attain this post. The Savilian Chair, endowed by Sir Henry Savile in 1619 to promote 
development of mathematics in England, was the second endowed mathematics chair 
in England; the first was founded in 1597 at Gresham College, London. With the rapid 
advancement of mathematics after 1550, it had become clear that university instruction 
in mathematics was essential, especially since this subject was proving useful in 
navigation and military matters. In fact, Italy and France had already established a 
number of mathematics professorships. 

At the time of his appointment, Wallis knew little more than the contents of the 
Clavis. But the professorship gave him access to the Savile Library with its fine 
collection of mathematics books. Wallis was most influenced by Frans van Schooten’s 
1649 Latin translation of Descartes’s La Géométrie and Evangelista Torricelli’s Opera 
Geometrica of 1644. Although Oughtred and Viète had employed algebra in the study
of geometry, Descartes took the process to a higher level by reducing the study of 
curves to algebraic equations by means of coordinate systems. At around the same 
time, Pierre Fermat (1607-1666) also made this major step, but his expositions on this 
and other topics were unfortunately published only posthumously. Wallis’s first book, 
De Sectionibus Conicis, written in 1652 and published in 1656, was clearly inspired 
by Descartes. He obtained properties of conic sections algebraically, making extensive 
use of the symbolic algebra developed by Harriot and Descartes. Wallis defined the 
parabola, hyperbola, and ellipse by means of algebraic equations; he remarked that 
“It is no more necessary that a parabola is the section of a cone by a plane parallel to
a side than that a circle is a section of a cone by a plane parallel to the base, or that a
triangle is a section through the vertex.”³

Wallis learned of Bonaventura Cavalieri’s method of indivisibles from Torricelli;
Wallis regarded his own Arithmetica Infinitorum as a continuation of Cavalieri, an
accurate assessment. Wallis spent a fair amount of his book computing the area
under $y = x^m$ when $m$ was a positive integer, using an arithmetical approach, as
contrasted with Torricelli’s geometrical method. Wallis then extended the result to the
case $m = \frac{1}{n}$ where $n$ was a positive integer, by observing that the curve $y = x^{1/n}$ was
identical to $x = y^n$ when seen from the y-axis. Now when the area under $y = x^{1/n}$ on
the interval (0,1) was added to the area under $x = y^n$, taken on the same interval on


3 Wallis and Stedall (2004) p. xiii.
the y-axis, the result was a square of area 1. But since Wallis had already found the
area under $y = x^n$ to be $\frac{1}{n+1}$, the area under $y = x^{1/n}$ turned out to be⁴

$$1 - \frac{1}{n+1} = \frac{n}{n+1} = \frac{1}{\frac{1}{n} + 1}. \tag{3.2}$$

Wallis then jumped to the conclusion that the area under $y = x^{m/n}$ over the unit
interval, where $m$ and $n$ were positive integers, was⁵

$$\frac{1}{\frac{m}{n} + 1}. \tag{3.3}$$


In the Arithmetica, Wallis’s aim was to obtain the arithmetical quadrature of the
circle. In modern terms this means that he wished to evaluate the integral $\int_0^1 (1 - x^2)^{1/2}\,dx$
using numerical calculations. Since (3.3) gave the value of $\int_0^1 x^{m/n}\,dx$, Wallis’s plan of
attack was to compute $\int_0^1 (1 - x^{1/p})^q\,dx$ for positive integer values of $p$ and $q$ and then
interpolate the values of the integral for fractional $p$ and $q$. The area of the quarter
circle was obtained when $p = q = \frac{1}{2}$. To compute $\int_0^1 (1 - x^{1/p})^q\,dx$ for integer $q$,
Wallis expanded the integrand and integrated term by term. For example, for $q = 3$
one has (in modern notation)

$$\int_0^1 (1 - x^{1/p})^3\,dx = \int_0^1 \left(1 - 3x^{1/p} + 3x^{2/p} - x^{3/p}\right)dx = 1 - \frac{3}{\frac{1}{p} + 1} + \frac{3}{\frac{2}{p} + 1} - \frac{1}{\frac{3}{p} + 1}.$$


In proposition 131, he tabulated thirty-six values of these integrals (or areas) for
$1 \le p, q \le 6$, of which we present the reciprocals:

     2     3     4     5     6     7
     3     6    10    15    21    28
     4    10    20    35    56    84
     5    15    35    70   126   210
     6    21    56   126   252   462
     7    28    84   210   462   924


Here the rows are given by p and the columns by g. Wallis observed that these 
were figurate numbers. For example, the second row/column consisted of triangular 
numbers, the third row/column of pyramidal numbers, and so on. It was already known 
(though Wallis may have rediscovered this) that these numbers could be expressed as 
ratios of two products. Thus, as discussed in our Section 2.1, the numbers in the pth 
row were given by 


$$\frac{(q+1)(q+2)\cdots(q+p)}{p!}.$$
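Wallis's table can be regenerated by following his procedure literally: expand $(1 - x^{1/p})^q$ by the binomial theorem, integrate each power using (3.3), and take the reciprocal. The sketch below uses exact fractions; the function name `wallis_reciprocal` is a hypothetical label for this illustration.

```python
from fractions import Fraction
from math import comb

def wallis_reciprocal(p, q):
    """1 / integral_0^1 (1 - x^(1/p))^q dx, integrating the expansion term by term."""
    integral = sum(Fraction((-1) ** k * comb(q, k)) / (Fraction(k, p) + 1)
                   for k in range(q + 1))
    return 1 / integral

# Rows indexed by p = 1..6, columns by q = 1..6, as in the table above.
table = [[wallis_reciprocal(p, q) for q in range(1, 7)] for p in range(1, 7)]
for row in table:
    print(" ".join(f"{int(v):5d}" for v in row))
```

Every entry agrees with $(q+1)(q+2)\cdots(q+p)/p!$, that is, with the binomial coefficient $\binom{p+q}{p}$.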


4 ibid. propositions 54-57. 
5 ibid. proposition 59.
Therefore, if

$$w(p,q) = \frac{1}{\int_0^1 (1 - x^{1/p})^q\,dx},$$

then Wallis had

$$w(p,q) = \frac{(q+1)(q+2)\cdots(q+p)}{p!}. \tag{3.4}$$


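Wallis then assumed that the formula continued to hold when $q$ was a half integer.
Of course, $p$ could not be taken to be a half integer since neither the denominator
nor the numerator would have meaning in that case. However, for $p = \frac{1}{2}$ the integral
would be $\int_0^1 (1 - x^2)^q\,dx$; this could be easily computed when $q$ was an integer. So
Wallis had a row corresponding to $p = \frac{1}{2}$ and in proposition 168 he got the values of
$w(\frac{1}{2}, q)$ for $q = 0, 1, 2, 3, \ldots$ as

$$1, \quad \frac{3}{2}, \quad \frac{15}{8} = \frac{(\frac{1}{2}+1)(\frac{1}{2}+2)}{2!}, \quad \frac{105}{48} = \frac{(\frac{1}{2}+1)(\frac{1}{2}+2)(\frac{1}{2}+3)}{3!}, \quad \ldots.$$

To find $w(\frac{1}{2}, q)$ when $q$ was a half integer, he observed that (3.4) implied (in our
notation)

$$w(p, q+1) = w(p,q)\,\frac{p+q+1}{q+1}. \tag{3.5}$$

From this relation, he could get the value of $w(\frac{1}{2}, \frac{1}{2} + n)$, for integer $n$, in terms
of $w(\frac{1}{2}, \frac{1}{2})$. So if $A$ denoted $w(\frac{1}{2}, \frac{1}{2}) = \frac{4}{\pi}$, then proposition 189 stated that the row
corresponding to $p = \frac{1}{2}$ and $q = -\frac{1}{2}, 0, \frac{1}{2}, 1, \frac{3}{2}, 2, \frac{5}{2}, \ldots$ would be

$$\frac{1}{2}A, \quad 1, \quad A, \quad \frac{3}{2}, \quad \frac{4}{3}A, \quad \frac{3\times5}{2\times4}, \quad \frac{4\times6}{3\times5}A, \quad \frac{3\times5\times7}{2\times4\times6}, \quad \ldots. \tag{3.6}$$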


Wallis understood that (3.5) provided the rule for forming the subsequence of 
the first, third, fifth, ... terms and the subsequence of the second, fourth, sixth, ... 
terms, but he was initially unable to see how the two sequences were related. Wallis’s 
research was stalled at this stage in the spring of 1652. He consulted a number of his 
mathematical friends at Oxford including Christopher Wren, the famous architect, but 
none could help him. Three years later, he informed Oughtred of the progress he had 
made and where he was still stymied, ending his letter⁶ with the request, “wherein
if you can do me the favour to help me out; it will be a very great satisfaction to 
me, and (if I do not delude myself) of more use than at the first view it may seem to 
be.” Apparently, Oughtred could provide no assistance and eventually in the spring 
of 1655, Wallis requested help from Brouncker, who sent back an infinite continued 
fraction to solve the problem. It is likely that Brouncker’s solution inspired Wallis 


6 Wallis and Stedall (2004) p. xviii. For the full letter, see Rigaud (1841) vol. I, pp. 85-86.
to discover his own very different one, though some speculate that Wallis made his
discovery independently. 

William Brouncker (c. 1620-1684) may have studied at Oxford around 1636,
though he told his friend John Aubrey that he was “of no university.”⁷ However,
Brouncker was very proficient in languages as well as mathematics. He did all
his surviving mathematical work in association with Wallis, with the exception of
his series for $\ln 2$. In addition to the continued fraction for $\pi$, he wrote a short
piece on the rectification of the semicubical parabola $y = x^{3/2}$, probably after seeing
William Neil’s work. He also gave a method for solving Fermat’s problem of finding
integer solutions of $x^2 - Ny^2 = 1$ for a given positive integer $N$. This solution
can also be described in terms of continued fractions, but when Wallis wrote up
Brouncker’s method, he did not use that form. A letter of 1669 from Collins to James
Gregory⁸ suggests that Brouncker found the series for $(1 - x^2)^{1/2}$ independently of
Newton. Indeed, Charles II chose Brouncker as the inaugural President of the Royal
Society, a post he held from 1662 to 1677. The Society’s Philosophical Transac-
tions was founded during his tenure; the April 1668 issue contained his proof of
the formula⁹

$$\ln 2 = \frac{1}{1\cdot2} + \frac{1}{3\cdot4} + \frac{1}{5\cdot6} + \cdots. \tag{3.7}$$


Brouncker provided no explanation of how he obtained his very intriguing result 
on the continued fraction for $\pi$, and in his book Wallis presented only a sketch
of Brouncker’s argument. In the course of this discussion, Wallis included a short 
account of a few fundamental results on continued fractions, including the recurrence 
relations satisfied by the numerators and denominators of the successive convergents 
of a continued fraction. Brouncker’s result, as well as Wallis’s exposition of it, 
suggests connections between continued fractions and series, products, integrals, and 
rational approximations. It is surprising to note that, although Huygens and Cotes gave 
isolated results, no mathematician before Euler made a systematic study of continued 
fractions. Wallis’s book had a tremendous impact on Euler who, at the age of 22, 
used it as his starting point for his theory of gamma and beta functions. At about the 
same time, Euler began his investigations into continued fractions, as indicated by 
a 1731 letter from Euler to his friend Goldbach.¹⁰ He explained how he had applied
continued fractions to solve a Riccati equation. Shortly after that, he began researching 
the relation between continued fractions and infinite series, infinite products, and 
integrals. It is a remarkable fact that when Euler chanced upon a mathematical avenue 
or by-path, such as those suggested by Wallis, he explored it with vigor and almost 
always found numerous results of interest and value. 


7 Stedall (2000) p. 295. 
8 Turnbull (1939). 

9 Brouncker (1668). 

10 Fuss (1968) pp. 56-59.


3.2 Wallis’s Infinite Product for π


Although Wallis did not give an explicit definition, the concept of a logarithmically convex
sequence is crucial for understanding his derivation of the product for $\pi$. Wallis,
Newton, and Euler all made use of this idea. A sequence of positive numbers $\{a_n\}$ is
called logarithmically convex if

$$\ln a_n \le \tfrac{1}{2}\left(\ln a_{n-1} + \ln a_{n+1}\right), \quad n = 1, 2, 3, \ldots, \tag{3.8}$$
or
$$a_n^2 \le a_{n-1}a_{n+1}, \quad n = 1, 2, 3, \ldots. \tag{3.9}$$
Now a sequence $\{a_n\}$ is logarithmically concave if

$$a_n^2 \ge a_{n-1}a_{n+1}, \quad n = 1, 2, 3, \ldots. \tag{3.10}$$

In addition, a positive function $f$ on an interval $(a,b)$ is called logarithmically
convex if $f$ is continuous and if for every pair of points $x_1, x_2 \in (a,b)$

$$\ln f\!\left(\frac{x_1 + x_2}{2}\right) \le \tfrac{1}{2}\left(\ln f(x_1) + \ln f(x_2)\right).$$

Wallis, we may recall, was searching for a rule capable of describing (3.6) in some
form. He eventually arrived at the deep insight that the sequence of the reciprocals
was logarithmically convex. This allowed him to express the first term of the sequence
as an infinite product. To reach his insight, Wallis first denoted the numbers in the
sequence (3.6) by the letters $\alpha, a, \beta, b, \gamma, c, \delta, d$ etc. He observed in proposition 191
that the ratios

$$\frac{\beta}{\alpha} = \frac{2}{1}, \quad \frac{b}{a} = \frac{3}{2}, \quad \frac{\gamma}{\beta} = \frac{4}{3}, \quad \frac{c}{b} = \frac{5}{4}, \quad \frac{\delta}{\gamma} = \frac{6}{5}, \quad \frac{d}{c} = \frac{7}{6}, \quad \text{etc.}$$

were decreasing. He then assumed the same for the ratios $\frac{a}{\alpha}, \frac{\beta}{a}, \frac{b}{\beta}$, etc. This meant
that $a^2 > \alpha\beta$, $\beta^2 > ab$, $b^2 > \beta\gamma$, and so on. So, if we denote three consecutive
members of (3.6) by $a_{n-1}, a_n, a_{n+1}$, we must have

$$a_n^2 > a_{n-1}a_{n+1}, \quad a_0 = \tfrac{1}{2}A. \tag{3.11}$$

Since this indicates logarithmic concavity, the reciprocals must be logarithmically
convex. Wallis wrote down the first few of these inequalities explicitly. Thus, $c^2 > \gamma\delta$
and $\delta^2 > cd$ gave him

$$A < \frac{3\times3\times5\times5}{2\times4\times4\times6}\sqrt{\frac{6}{5}}, \qquad A > \frac{3\times3\times5\times5}{2\times4\times4\times6}\sqrt{\frac{7}{6}}.$$

In general, we can write these inequalities as

$$\frac{3\times3\times\cdots\times(2n-1)\times(2n-1)}{2\times4\times\cdots\times(2n-2)\times2n}\sqrt{\frac{2n+1}{2n}} < A < \frac{3\times3\times\cdots\times(2n-1)\times(2n-1)}{2\times4\times\cdots\times(2n-2)\times2n}\sqrt{\frac{2n}{2n-1}}.$$

Studying the pattern evident in only the first few cases of these two inequalities,
Wallis concluded that

$$A = \frac{4}{\pi} = \frac{3\times3\times5\times5\times7\times7\times\cdots}{2\times4\times4\times6\times6\times8\times\cdots}. \tag{3.12}$$
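The squeeze that leads to (3.12) can be watched numerically. The sketch below evaluates the two bounds on $A$ for a given $n$; it assumes the general inequalities in the form $P_n\sqrt{(2n+1)/(2n)} < A < P_n\sqrt{2n/(2n-1)}$ with $P_n = \frac{3\cdot3\cdot5\cdot5\cdots(2n-1)(2n-1)}{2\cdot4\cdot4\cdots(2n-2)(2n-2)\cdot2n}$, and the function name is hypothetical.

```python
from math import pi, sqrt

def wallis_bounds(n):
    """Lower and upper bounds for A = 4/pi from the n-th pair of inequalities (n >= 2)."""
    prod = 1.0
    for k in range(2, n + 1):
        # Factor (2k-1)^2 / ((2k-2) * 2k); the running product is P_n.
        prod *= (2 * k - 1) ** 2 / ((2 * k - 2) * (2 * k))
    return prod * sqrt((2 * n + 1) / (2 * n)), prod * sqrt(2 * n / (2 * n - 1))

for n in (3, 10, 100, 1000):
    lo, hi = wallis_bounds(n)
    print(n, lo, 4 / pi, hi)
```

For $n = 3$ this reproduces the two explicit bounds with $\sqrt{7/6}$ and $\sqrt{6/5}$, and the gap between the bounds shrinks roughly like $1/n^2$.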


Newton studied Wallis as a student in the winter of 1664-65 and made notes in a
notebook now held by the University Library, Cambridge. Here Newton observed¹¹
that Wallis’s proof of (3.12) could be simplified, writing in his notebook, “Thus Wallis
doth it, but it may bee [sic] done thus.” He noted that the sequence (3.6) was increasing,
though he did not explain why. Observe, however, that the terms of the sequence are
the reciprocals of the integrals

$$\int_0^1 (1 - x^2)^m\,dx, \quad m = -\tfrac{1}{2},\ 0,\ \tfrac{1}{2},\ 1,\ \ldots. \tag{3.13}$$

The integrand decreases as $m$ increases and hence so does the integral. Therefore, we
see that

$$\frac{3\times5\times\cdots\times(2n-1)}{2\times4\times\cdots\times(2n-2)} < \frac{4\times6\times\cdots\times2n}{3\times5\times\cdots\times(2n-1)}\,A < \frac{3\times5\times\cdots\times(2n-1)\times(2n+1)}{2\times4\times\cdots\times(2n-2)\times2n}.$$

These two inequalities together imply (3.12). Newton’s argument certainly
shortened the proof of Wallis. But Wallis’s use of (3.11) gave a deep insight into the
connection between interpolation of factorials and logarithmic convexity. Note that
the inequality (3.11) implies the logarithmic convexity of the sequence $\frac{1}{a_n}$. This was
fully understood only in the 1920s, when Bohr and Mollerup showed that logarithmic
convexity was one of the defining properties of the gamma function, by which the
factorial is interpolated; in this connection, see our Chapter 17. Thus, as Bourbaki also
commented,¹² Wallis’s methods are very similar to those used today in the theory of
the gamma function. It is possible that by 1890 the Dutch mathematician T. J. Stieltjes
had also gained an understanding of the significance of logarithmic convexity as it
related to the gamma function.

To understand more clearly the meaning of the logarithmic convexity of the
sequence in (3.13), observe that $m - \frac{1}{2}, m, m + \frac{1}{2}$ are three successive values of $m$,


11 Newton (1967-1981) vol. 1, p. 103.
12 Bourbaki (1994) p. 187.
where the least value of $m$ is 0. If we denote the integral in the equation (3.13) by $W_m$,
then the logarithmic convexity of the sequence in that equation entails that

$$W_m^2 \le W_{m-\frac{1}{2}}\,W_{m+\frac{1}{2}}, \tag{3.14}$$

and this is the inequality being assumed by Wallis.


3.3. Brouncker and Infinite Continued Fractions 


Indian mathematicians between 700 and 1500 discussed finite continued fractions.¹³
We have noted that it is possible that the Kerala school also had a conception of
infinite continued fractions. It seems, however, that the first explicit discussions of
infinite continued fractions appeared in the works of two professors of mathematics at
the University of Bologna: Rafael Bombelli (1526-1572) and Pietro Antonio Cataldi
(1548-1626). In 1572, Bombelli described a method for computing $\sqrt{13}$,¹⁴ amounting
to the continued fraction expansion

$$3 + \frac{4}{6+}\,\frac{4}{6+}\cdots = \sqrt{13};$$

observe that in this notation, the left-hand side denotes the continued fraction

$$3 + \cfrac{4}{6 + \cfrac{4}{6 + \cdots}}.$$


Though Cataldi’s work appeared later than that of Bombelli, he may fairly be 
regarded as the creator of the theory of infinite continued fractions. He explained 
how to expand the square root of a number in terms of fractions in such a way 
as to clearly show that an infinite continued fraction must result.¹⁵ Moreover, he
introduced a modern notation for continued fractions, also used by Wallis. Cataldi 
also gave the recurrence relations satisfied by the successive convergents of the 
continued fraction representation of a quadratic irrational. Finally, he showed that the 
convergents were successively larger and smaller than the continued fraction and that 
they converged to it. 

Brouncker utilized continued fractions to present an ingenious solution to Wallis’s
longstanding problem of finding the law of formation of the sequence (3.6). He stated
that the continued fraction

$$\phi(n) = n + \frac{1^2}{2n+}\,\frac{3^2}{2n+}\,\frac{5^2}{2n+}\cdots, \quad n = 0, 1, 2, 3, \ldots \tag{3.15}$$


13 See Brezinski (1991) chapter 1.

14 ibid. pp. 62-64 gives excerpts from the first and second editions of Bombelli’s algebra book, where he
described the method for finding the continued fraction for $\sqrt{13}$.

15 ibid. pp. 65-70.
had the two properties:¹⁶
$$\phi(n-1)\,\phi(n+1) = n^2, \quad n = 0, 1, 2, \ldots \tag{3.16}$$
and
$$\phi(1) = \frac{4}{\pi} = A. \tag{3.17}$$

It follows from these properties that the $m$th term of the sequence (3.6), starting at
1 rather than at $\frac{1}{2}A$, is given by

$$\frac{A}{2}\cdot\frac{2}{\phi(1)}\cdot\frac{4}{\phi(3)}\cdots\frac{2m}{\phi(2m-1)}, \quad m = 1, 2, 3, \ldots. \tag{3.18}$$

If we take the empty product in (3.18) to be 1, then for $m = 0$, we also get the
term $\frac{1}{2}A$ in (3.6).
Wallis was able to prove (3.17) from his formula (3.12) combined with (3.16).¹⁷
We note briefly that by (3.16),

$$\phi(1) = \frac{2^2}{4^2}\cdot\frac{6^2}{8^2}\cdots\frac{(4m-2)^2}{(4m)^2}\,\phi(4m+1) = \frac{3^2\cdot5^2\cdots(2m-1)^2}{2^2\cdot4^2\cdots(2m-2)^2\cdot2m}\cdot\frac{\phi(4m+1)}{2m}. \tag{3.19}$$

Now by (3.15), $n < \phi(n) < n + 1$, and, therefore,

$$\frac{4m+1}{2m} < \frac{\phi(4m+1)}{2m} < \frac{4m+2}{2m}.$$

If we let $m \to \infty$ in (3.19), then these inequalities and Wallis’s formula imply that
$\phi(1) = A$. Wallis did not give a complete proof of (3.16), but one may reconstruct
his thought from the arguments he gave. He wrote that Brouncker had noticed that
the product of two consecutive odd or even numbers was one less than a square, since
$(n-1)(n+1) = n^2 - 1$. He then asked by what fraction the factors should be increased
so that one obtained $n^2$ rather than $n^2 - 1$. We may say that he looked for a function
$\phi(n)$ such that

$$\phi(n-1)\,\phi(n+1) = n^2.$$

Since $\phi(n) = n$ gives $n^2 - 1$, we take

$$\phi(n) = n + \frac{a_1}{\phi_1(n)}, \tag{3.20}$$

16 See also Eu. I-14 pp. 291-349. E123, § 15.
17 See Wallis’s commentary on proposition 191 in Wallis and Stedall (2004) pp. 168-178. See also
Stedall (2000) pp. 300-305.
where $a_1$ is a constant to be determined.¹⁸ Substituting in (3.16), we get

$$-\phi_1(n-1)\,\phi_1(n+1) + a_1(n+1)\,\phi_1(n+1) + a_1(n-1)\,\phi_1(n-1) + a_1^2 = 0. \tag{3.21}$$

The symmetry of (3.16) is preserved if we take $a_1 = 1$, for then (3.21) can be
written as

$$\bigl(\phi_1(n-1) - (n+1)\bigr)\bigl(\phi_1(n+1) - (n-1)\bigr) = n^2. \tag{3.22}$$
Now let
$$\phi_1(n) = 2n + \frac{a_2}{\phi_2(n)}, \tag{3.23}$$

so that (3.22) simplifies to

$$-9\,\phi_2(n-1)\,\phi_2(n+1) + a_2(n+3)\,\phi_2(n+1) + a_2(n-3)\,\phi_2(n-1) + a_2^2 = 0. \tag{3.24}$$

If we take $a_2 = 3^2$, then we get an equation similar to (3.22):
$$\bigl(\phi_2(n-1) - (n+3)\bigr)\bigl(\phi_2(n+1) - (n-3)\bigr) = n^2.$$

So set
$$\phi_2(n) = 2n + \frac{a_3}{\phi_3(n)},$$

and it turns out that $a_3 = 5^2$. One can continue in this way to get the continued fraction
expansion (3.15).
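Brouncker's two properties can be checked numerically by truncating the continued fraction (3.15). The sketch below is an illustration only: the truncation depth is an arbitrary assumption, convergence for $n = 1$ is slow (the error decays roughly like the reciprocal of the depth), and the function name is hypothetical.

```python
from math import pi

def brouncker_phi(n, depth=20000):
    """Approximate n + 1^2/(2n + 3^2/(2n + 5^2/(2n + ...))) for n >= 1.

    The fraction is truncated after `depth` further partial numerators and
    evaluated from the innermost level outward.
    """
    tail = 2.0 * n
    for k in range(depth, 0, -1):
        tail = 2.0 * n + (2 * k + 1) ** 2 / tail
    return n + 1.0 / tail

print(brouncker_phi(1), 4 / pi)                 # property (3.17)
for n in (2, 3, 4):
    print(n, brouncker_phi(n - 1) * brouncker_phi(n + 1), n * n)   # property (3.16)
```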

Wallis’s contribution to the theory of continued fractions was to note the recurrence
relations for the convergents of a general continued fraction.¹⁹ Take a continued
fraction

$$C = b_0 + \frac{a_1}{b_1+}\,\frac{a_2}{b_2+}\,\frac{a_3}{b_3+}\cdots, \tag{3.25}$$

and set the $n$th convergent (or approximant) of the continued fraction to be

$$C_n = \frac{P_n}{Q_n} = b_0 + \frac{a_1}{b_1+}\,\frac{a_2}{b_2+}\cdots\frac{a_n}{b_n}, \quad n = 1, 2, 3, \ldots \tag{3.26}$$
with $P_0 = b_0$, $P_{-1} = 1$, $Q_0 = 1$, and $Q_{-1} = 0$. Then Wallis’s recurrence relations for
the numerators and denominators $P_n$, $Q_n$ of the convergents can be written as

$$P_n = b_n P_{n-1} + a_n P_{n-2}, \tag{3.27}$$
$$Q_n = b_n Q_{n-1} + a_n Q_{n-2}. \tag{3.28}$$


18 See Whiteside (1961b) pp. 211-212.
19 Wallis and Stedall (2004) p. 176.
Wallis wrote the continued fraction (3.25) with $b_0 = 0$ in his own notation
and gave the first four convergents. He stated the rules (3.27) and (3.28) in words and
showed how they worked by an example. He remarked that these results allowed one to
compute the convergents by starting at the beginning of the fraction rather than from
the end. The twelfth-century Indian mathematician Bhaskara, in his Lilavati (1150),
also gave the rules (3.27) and (3.28).²⁰ Since he considered continued fractions of only
rational numbers, the value of $a_n$ was 1.
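The recurrences (3.27) and (3.28) translate directly into a forward computation of the convergents. The following minimal sketch uses exact fractions; the function name and the choice of Bombelli's expansion of $\sqrt{13}$ as the test case are assumptions made for illustration.

```python
from fractions import Fraction

def convergents(b0, a, b):
    """Yield P_n/Q_n for b0 + a[0]/(b[0] + a[1]/(b[1] + ...)) using (3.27)-(3.28)."""
    p_prev, q_prev = 1, 0          # P_{-1}, Q_{-1}
    p, q = b0, 1                   # P_0, Q_0
    yield Fraction(p, q)
    for a_n, b_n in zip(a, b):
        p, p_prev = b_n * p + a_n * p_prev, p
        q, q_prev = b_n * q + a_n * q_prev, q
        yield Fraction(p, q)

# Bombelli's expansion: sqrt(13) = 3 + 4/(6 + 4/(6 + ...)), six levels deep.
approx = list(convergents(3, [4] * 6, [6] * 6))
for c in approx:
    print(c, float(c))
```

As the text notes, each convergent is obtained from the two before it, so nothing has to be recomputed from the end of the fraction when another level is added.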


3.4 Méray and Stieltjes: The Probability Integral 


In his 1730 work on the interpolation of the sequence of factorials,²¹ Euler noted the
relation between an integral and Wallis’s infinite product: 


eel (3.29) 


We remark that Euler did not use the idea of a limit; he wrote the infinite
product instead. However, a change of variables and integration by parts produces
the probability integral:

$$\int_0^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}. \tag{3.30}$$


Although Euler’s 1730 paper did not contain a proof of (3.29) or (3.30), several
proofs were well-known by the time Charles Méray published his 1888 paper,²²
“Valeur de l’intégrale définie $\int_0^\infty e^{-x^2}dx$ déduite de la formule de Wallis.” In his
paper, Méray wrote that the integral (3.30) played a considerable role in the theory
of least errors and that he had presented in his paper an interesting proof with the
advantage of complete rigor. His paper showed how the integral under discussion
could be directly expressed in terms of Wallis’s product; this may well have been a
new development. Thus, his derivation is worth studying for its own sake, and because
Méray’s work has been somewhat overlooked by the mathematical community.
His 1869 paper,²³ “Remarques sur la nature des quantités définies par la condition


20 Brezinski (1991) pp. 32-33. 

21 Eu. I-14 pp. 1-24. E 19. Also see our Chapter 17.
22 Méray (1888).

23 Méray (1869).
de servir de limites à des variables données,” gave the first published version
of a theory of real numbers. Now Weierstrass had presented his theory of real
numbers in his Berlin lectures in the 1860s, and in 1858 Dedekind had worked
out his theory using Dedekind cuts,²⁴ although neither was published until later
(Dedekind’s in 1872). And Méray’s 1869 paper was ignored in France because during
that period, interest in this subject seems to have been limited to Germany. Upon
reading Méray’s 1888 paper on the probability integral, Stieltjes produced a simplified
argument that is also of interest.
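Méray started with the integral

$$I_n = \int_0^\infty x^n e^{-x^2}\,dx \tag{3.31}$$
with $n \ge 0$ an integer. Writing $x^n e^{-x^2} = x e^{-x^2}\cdot x^{n-1}$ and integrating by parts yielded
$$\int x^n e^{-x^2}\,dx = -\frac{1}{2}e^{-x^2}x^{n-1} + \frac{n-1}{2}\int x^{n-2}e^{-x^2}\,dx, \tag{3.32}$$
or
$$I_n = \frac{n-1}{2}\,I_{n-2}. \tag{3.33}$$
Thus when $n$ was even, say $n = 2m$,
$$I_{2m} = \frac{2m-1}{2}\cdot\frac{2m-3}{2}\cdots\frac{1}{2}\,I_0, \tag{3.34}$$
and with $n$ odd, say $n = 2m + 1$,
$$I_{2m+1} = m!\,I_1 = \frac{m!}{2}. \tag{3.35}$$
With these values in hand, Méray made the substitution in integral (3.31)
$$x = \left(\frac{n-1}{2}\,y\right)^{\frac{1}{2}}$$
to obtain
$$I_n = \frac{1}{2}\left(\frac{n-1}{2}\right)^{\frac{n+1}{2}} e^{-\frac{n-1}{2}}\,T_n, \tag{3.36}$$
where
$$T_n = \int_0^\infty \left(e\,y\,e^{-y}\right)^{\frac{n-1}{2}} dy. \tag{3.37}$$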

24 Dedekind (1872). For a translation into English, see Dedekind (1963).
He next observed that $e\,y\,e^{-y}$ was increasing for $y < 1$, decreasing for $y > 1$, and
equal to 1 when $y = 1$. This implied that $T_n$ would decrease when $n$ increased; thus

$$T_{2m+1} < T_{2m} < T_{2m-1}. \tag{3.38}$$

He wrote $T_n$ in terms of $I_n$, using (3.36), so that (3.38), combined with (3.34) and
(3.35) gave him

$$m^{-(m+1)}e^{m}\,\frac{m!}{2} < \left(\frac{2m-1}{2}\right)^{-\frac{2m+1}{2}} e^{\frac{2m-1}{2}}\cdot\frac{2m-1}{2}\cdot\frac{2m-3}{2}\cdots\frac{1}{2}\,I_0 < (m-1)^{-m}e^{m-1}\,\frac{(m-1)!}{2}.$$

Divide across by $m^{-m}e^{m}\,\frac{(m-1)!}{2}$ to obtain the inequalities

$$1 < \left(1 - \frac{1}{2m}\right)^{-\frac{2m+1}{2}} e^{-\frac{1}{2}}\cdot\frac{(2m-1)(2m-3)\cdots3\cdot1}{\sqrt{m}\,(2m-2)\cdots4\cdot2}\,I_0 < \left(1 - \frac{1}{m}\right)^{-m} e^{-1},$$

then let $m \to \infty$ and use Wallis’s formula to arrive at

$$I_0^2 = \frac{1}{2}\lim_{m\to\infty}\frac{2^2\cdot4^2\cdots(2m-2)^2}{3^2\cdot5^2\cdots(2m-1)^2}\,(2m) = \frac{\pi}{4}.$$


Stieltjes wrote in an 1890 paper on the same topic²⁵ that Méray’s proof became
simpler when one noted that the sequence $I_n$ was logarithmically convex, a fact for
which he gave a very easy proof:
He observed that for an arbitrary real number $x$,

$$I_{n+1} + 2xI_n + x^2I_{n-1} = \int_0^\infty u^{n-1}(u + x)^2 e^{-u^2}\,du > 0,$$

equivalent to

$$(I_{n-1}x + I_n)^2 > I_n^2 - I_{n-1}I_{n+1},$$


25 Stieltjes (1890).
so that Stieltjes could conclude that

$$I_n^2 < I_{n-1}I_{n+1}. \tag{3.39}$$
To see this, simply take $x = -\frac{I_n}{I_{n-1}}$. From (3.33) and (3.39), Stieltjes found
$$I_n^2 < \frac{n}{2}\,I_{n-1}^2. \tag{3.40}$$

Inequalities (3.39) and (3.40) produced the two inequalities

$$I_{2k}^2 > \frac{2}{2k+1}\,I_{2k+1}^2 \quad\text{and}\quad I_{2k}^2 < I_{2k-1}\,I_{2k+1}.$$

Therefore by (3.35), Stieltjes had

$$\frac{(1\cdot2\cdot3\cdots k)^2}{4k+2} < I_{2k}^2 < \frac{(1\cdot2\cdot3\cdots k)^2}{4k},$$

or

$$I_{2k}^2 = \frac{(1\cdot2\cdot3\cdots k)^2}{4k+2}\,(1 + \varepsilon), \quad 0 < \varepsilon < \frac{1}{2k}.$$

At this point, Stieltjes used (3.34) to conclude that

$$2I_0^2 = \frac{(2\cdot4\cdot6\cdots2k)^2}{(1\cdot3\cdot5\cdots(2k-1))^2\,(2k+1)}\,(1 + \varepsilon), \qquad 2I_0^2 = \frac{\pi}{2}, \qquad I_0 = \frac{\sqrt{\pi}}{2}.$$
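Stieltjes's two-sided estimate can be turned into a small numerical experiment. The sketch below assumes the bounds in the form $(k!)^2/(4k+2) < I_{2k}^2 < (k!)^2/(4k)$ and divides out the ratio $I_{2k}/I_0$ from (3.34), so the resulting bounds on $I_0$ do not use the value $\sqrt{\pi}/2$. The function names are hypothetical.

```python
from math import factorial, pi, sqrt

def I(n):
    """I_n = integral_0^inf x^n e^(-x^2) dx via the recurrence I_n = (n-1)/2 * I_(n-2)."""
    value = sqrt(pi) / 2 if n % 2 == 0 else 0.5      # I_0 or I_1
    for j in range(2 + n % 2, n + 1, 2):
        value *= (j - 1) / 2
    return value

def stieltjes_bounds(k):
    """Lower and upper bounds on I_0 from (k!)^2/(4k+2) < I_{2k}^2 < (k!)^2/(4k)."""
    prod = 1.0                       # k! / (I_{2k}/I_0) = product of 2j/(2j-1)
    for j in range(1, k + 1):
        prod *= 2 * j / (2 * j - 1)
    return prod / sqrt(4 * k + 2), prod / sqrt(4 * k)

# Check the bounds on I_{2k}^2 directly for k = 3, using the known I_0.
k = 3
print(factorial(k) ** 2 / (4 * k + 2), I(2 * k) ** 2, factorial(k) ** 2 / (4 * k))

for k in (1, 10, 100, 1000):
    lo, hi = stieltjes_bounds(k)
    print(k, lo, sqrt(pi) / 2, hi)
```

The two bounds differ by a factor $\sqrt{1 + 1/(2k)}$, which is the $(1+\varepsilon)$ of the text.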


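This was clearly a more direct route to Méray’s result. Now Stieltjes’s argument
may be used to prove Wallis’s conjectured inequality (3.14):

$$W_{m+\frac{1}{2}} + 2xW_m + x^2W_{m-\frac{1}{2}} = \int_0^1 (1 - t^2)^{m-\frac{1}{2}}\left(x^2 + 2x(1 - t^2)^{\frac{1}{2}} + (1 - t^2)\right)dt$$

$$= \int_0^1 (1 - t^2)^{m-\frac{1}{2}}\left(x + (1 - t^2)^{\frac{1}{2}}\right)^2 dt > 0,$$

implying Wallis’s inequality:

$$W_m^2 < W_{m-\frac{1}{2}}\,W_{m+\frac{1}{2}}.$$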


3.5 Euler: Series and Continued Fractions 


Wallis’s discussion of Brouncker’s continued fractions convinced Euler of their 
importance in analysis. Quite early in his career, he found a connection with the 
Riccati equation²⁶ and saw the necessity of relating continued fractions with series, 
products, and definite integrals. In this way, Euler succeeded in fleshing out the 
methods of which Wallis and Brouncker had only given key examples. 

Euler presented his general theorems on the conversion of series to continued 
fractions in such a way that the nth partial sum of the series and the nth convergent 
of the continued fraction were identical. Euler’s first paper on this topic, of 1737,²⁷ 
treated it somewhat briefly, but the second one, of 1739,²⁸ was more detailed. 
It explicitly stated the formulas for obtaining the corresponding series starting with a 
given continued fraction and, conversely, for obtaining the continued fraction from the 
given series. 

We follow Euler’s approach from the first book in which he treated this topic, 
Introductio in analysin infinitorum;²⁹ he discusses continued fractions in chapter 
18. He wrote a continued fraction in the form 


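\[ a + \cfrac{\alpha}{b + \cfrac{\beta}{c + \cfrac{\gamma}{d + \cfrac{\delta}{e + \text{etc.}}}}} \tag{3.41} \]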


Using subscripts to clarify Euler’s expressions for the modern reader, we replace 
$a, b, c, d, \ldots$ by $b_0, b_1, b_2, b_3, \ldots$ and $\alpha, \beta, \gamma, \delta, \ldots$ by $a_1, a_2, a_3, a_4, \ldots$ so that in 
modern notation, Euler’s (3.41) would be written as 

\[ b_0 + \frac{a_1}{b_1 +}\,\frac{a_2}{b_2 +}\,\frac{a_3}{b_3 +}\cdots. \tag{3.42} \]


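He first observed that the successive fractions would be 

\[ \frac{b_0}{1}, \quad \frac{b_0 b_1 + a_1}{b_1}, \quad \frac{b_0 b_1 b_2 + b_0 a_2 + a_1 b_2}{b_1 b_2 + a_2}, \quad \frac{b_0 b_1 b_2 b_3 + b_0 b_3 a_2 + b_2 b_3 a_1 + b_0 b_1 a_3 + a_1 a_3}{b_1 b_2 b_3 + b_3 a_2 + b_1 a_3}, \quad \ldots \]

For brevity, we may denote the successive fractions by 

\[ \frac{A_0}{B_0}, \quad \frac{A_1}{B_1}, \quad \frac{A_2}{B_2}, \quad \frac{A_3}{B_3}, \quad \frac{A_4}{B_4}, \quad \ldots \]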


26 Fuss (1969) vol. 1, pp. 56-59. 

27 Eu. I-14 pp. 187-216. E 71. 

28 Eu. I-14 pp. 291-349. E 123. 

29 Euler (1988) provides a translation into English. 
Euler placed the fraction $\frac{1}{0}$ as the first fraction of his list of successive fractions so 
that the recurrence relations for forming the numerators and denominators would hold 
when $n = 1$. Thus he had: 

\[ \frac{1}{0}, \quad \frac{A_0}{B_0}, \quad \frac{A_1}{B_1}, \quad \frac{A_2}{B_2}, \quad \frac{A_3}{B_3}, \quad \frac{A_4}{B_4}, \quad \ldots \tag{3.43} \]

and he could state these rules: 

\[ A_n = b_n A_{n-1} + a_n A_{n-2}, \quad n = 1, 2, 3, \ldots \tag{3.44} \]
\[ B_n = b_n B_{n-1} + a_n B_{n-2}, \quad n = 1, 2, 3, \ldots, \tag{3.45} \]

where $A_{-1} = 1$ and $B_{-1} = 0$. Observe that $A_0 = b_0$ and $B_0 = 1$. 
To obtain the series corresponding to a given continued fraction, Euler considered 
the differences of the successive fractions:³⁰ 

\[ \frac{A_1}{B_1} - \frac{A_0}{B_0}, \quad \ldots \]
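Using (3.44) and (3.45), he found 

\begin{align*}
\frac{A_n}{B_n} - \frac{A_{n-1}}{B_{n-1}} &= \frac{A_n B_{n-1} - A_{n-1} B_n}{B_n B_{n-1}} \\
&= \frac{(b_n A_{n-1} + a_n A_{n-2}) B_{n-1} - A_{n-1}(b_n B_{n-1} + a_n B_{n-2})}{B_n B_{n-1}} \\
&= \frac{a_n (A_{n-2} B_{n-1} - A_{n-1} B_{n-2})}{B_n B_{n-1}} \\
&= \frac{-a_n (A_{n-1} B_{n-2} - A_{n-2} B_{n-1})}{B_n B_{n-1}} \\
&= \frac{a_n a_{n-1} (A_{n-2} B_{n-3} - A_{n-3} B_{n-2})}{B_n B_{n-1}} \\
&= \frac{(-1)^{n-1} a_n a_{n-1} a_{n-2} \cdots a_1}{B_n B_{n-1}}.
\end{align*}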


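Euler noted that the successive partial fractions (3.43) were given by 

\[ \frac{A_0}{B_0}, \quad \frac{A_0}{B_0} + \left(\frac{A_1}{B_1} - \frac{A_0}{B_0}\right) = \frac{A_1}{B_1}, \quad \frac{A_0}{B_0} + \left(\frac{A_1}{B_1} - \frac{A_0}{B_0}\right) + \left(\frac{A_2}{B_2} - \frac{A_1}{B_1}\right) = \frac{A_2}{B_2}, \quad \ldots \]

The nth partial fraction would thus be given by the series 

\begin{align*}
\frac{A_n}{B_n} &= \frac{A_0}{B_0} + \left(\frac{A_1}{B_1} - \frac{A_0}{B_0}\right) + \cdots + \left(\frac{A_n}{B_n} - \frac{A_{n-1}}{B_{n-1}}\right) \\
&= \frac{A_0}{B_0} + \frac{a_1}{B_0 B_1} - \frac{a_1 a_2}{B_1 B_2} + \frac{a_1 a_2 a_3}{B_2 B_3} - \cdots + (-1)^{n-1}\frac{a_1 a_2 \cdots a_n}{B_{n-1} B_n}.
\end{align*}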
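The recurrences (3.44) and (3.45) and the resulting series can be checked mechanically. The following is a minimal sketch in Python using exact rational arithmetic; the function names and the sample values of the $b_n$ and $a_n$ are illustrative assumptions, not anything taken from Euler's text.

```python
from fractions import Fraction

def convergents(b, a):
    """Return [(A_0, B_0), ..., (A_n, B_n)] for b_0 + a_1/(b_1 + a_2/(b_2 + ...)).

    b = [b_0, ..., b_n] and a = [a_1, ..., a_n]. Uses the rules
    A_n = b_n A_{n-1} + a_n A_{n-2} and B_n = b_n B_{n-1} + a_n B_{n-2},
    started from A_{-1} = 1, B_{-1} = 0, A_0 = b_0, B_0 = 1.
    """
    A_prev, B_prev = 1, 0
    A, B = b[0], 1
    pairs = [(A, B)]
    for b_n, a_n in zip(b[1:], a):
        A, A_prev = b_n * A + a_n * A_prev, A
        B, B_prev = b_n * B + a_n * B_prev, B
        pairs.append((A, B))
    return pairs

def series_partial_sum(b, a):
    """A_0/B_0 + sum over k of (-1)^(k-1) a_1...a_k / (B_{k-1} B_k)."""
    pairs = convergents(b, a)
    total = Fraction(pairs[0][0], pairs[0][1])
    product = 1
    for k in range(1, len(pairs)):
        product *= a[k - 1]
        total += Fraction((-1) ** (k - 1) * product, pairs[k - 1][1] * pairs[k][1])
    return total

# Hypothetical sample data: b_0 = 0 and five further levels.
b = [0, 1, 1, 1, 1, 1]
a = [1, 1, 4, 9, 16]
A_n, B_n = convergents(b, a)[-1]
print(Fraction(A_n, B_n), series_partial_sum(b, a))
```

Both printed values agree, which is the content of Euler's observation that the nth convergent and the nth partial sum of the associated series coincide.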


30 Euler (1988) pp. 306-308. 




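Following Euler, consider the case $A_0 = b_0 = 0$, so that $\frac{A_n}{B_n}$ could be expressed as 
an alternating series: 

\[ \frac{A_n}{B_n} = \frac{a_1}{B_0 B_1} - \frac{a_1 a_2}{B_1 B_2} + \frac{a_1 a_2 a_3}{B_2 B_3} - \cdots + (-1)^{n-1}\frac{a_1 a_2 \cdots a_n}{B_{n-1} B_n}. \]

Now, assuming the existence of $\lim_{n\to\infty} \frac{A_n}{B_n}$, he had the infinite continued fraction 
(3.42), with $b_0 = 0$, expressed as a series: 

\[ \frac{a_1}{B_0 B_1} - \frac{a_1 a_2}{B_1 B_2} + \frac{a_1 a_2 a_3}{B_2 B_3} - \cdots. \tag{3.46} \]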


Euler next showed how to obtain a continued fraction from a given series;³¹ denote 
the series by 

\[ C_1 - C_2 + C_3 - C_4 + C_5 - \cdots. \tag{3.47} \]


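Comparing (3.47) with (3.46), he found that 

\[ C_1 = \frac{a_1}{B_0 B_1}, \quad C_2 = \frac{a_1 a_2}{B_1 B_2}, \quad C_3 = \frac{a_1 a_2 a_3}{B_2 B_3}, \quad \ldots, \quad C_n = \frac{a_1 a_2 \cdots a_n}{B_{n-1} B_n}. \tag{3.48} \]

Euler observed that (3.48) gave him 

\[ \frac{C_n}{C_{n-1}} = \frac{a_n B_{n-2}}{B_n}; \tag{3.49} \]

subtracting each side of (3.49) from 1, multiplying by $C_{n-1}$, and then using (3.45), he obtained 

\[ C_{n-1} - C_n = \frac{C_{n-1}(B_n - a_n B_{n-2})}{B_n} = \frac{C_{n-1} B_{n-1} b_n}{B_n}. \tag{3.50} \]

Taking the product of the differences in (3.50) yielded 

\[ (C_{n-1} - C_n)(C_n - C_{n+1}) = \frac{C_{n-1} B_{n-1} b_n}{B_n} \cdot \frac{C_n B_n b_{n+1}}{B_{n+1}} = \frac{C_{n-1} C_n b_n b_{n+1} B_{n-1}}{B_{n+1}}. \]

Thus 

\[ \frac{B_{n+1}}{B_{n-1}} = \frac{C_{n-1} C_n b_n b_{n+1}}{(C_{n-1} - C_n)(C_n - C_{n+1})}. \tag{3.51} \]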
Finally, from (3.48), (3.49), and (3.51) he concluded 

\[ a_1 = C_1 b_1, \quad a_2 = \frac{C_2 b_1 b_2}{C_1 - C_2}, \quad a_3 = \frac{C_1 C_3 b_2 b_3}{(C_1 - C_2)(C_2 - C_3)}, \quad a_4 = \frac{C_2 C_4 b_3 b_4}{(C_2 - C_3)(C_3 - C_4)}, \quad \ldots \tag{3.52} \]


31 ibid. pp. 308-313. 


Euler then observed that, since the numerators of the continued fraction, that is 
$a_1, a_2, a_3, \ldots$, were known, the values of the denominators $b_1, b_2, b_3, \ldots$ could be 
arbitrarily chosen. That is, if $C_1, C_2, C_3, \ldots$ were integers, then $b_1, b_2, b_3, \ldots$ could 
be so selected that $a_1, a_2, a_3, \ldots$ would turn out to be integers. Thus, taking the values 
of $b_1, b_2, b_3, b_4, \ldots$ to be $1, C_1 - C_2, C_2 - C_3, C_3 - C_4, \ldots$ respectively, then 

\[ a_1 = C_1, \quad a_2 = C_2, \quad a_3 = C_1 C_3, \quad a_4 = C_2 C_4, \quad \ldots. \]

The continued fraction corresponding to (3.47) would then be, in modern notation: 

\[ \frac{C_1}{1 +}\,\frac{C_2}{C_1 - C_2 +}\,\frac{C_1 C_3}{C_2 - C_3 +}\,\frac{C_2 C_4}{C_3 - C_4 +}\cdots. \]


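On the other hand, Euler observed, if the series were 

\[ \frac{1}{C_1} - \frac{1}{C_2} + \frac{1}{C_3} - \frac{1}{C_4} + \cdots, \tag{3.53} \]

then the equations in (3.52) could be written as 

\[ a_1 = \frac{b_1}{C_1}, \quad a_2 = \frac{C_1 b_1 b_2}{C_2 - C_1}, \quad a_3 = \frac{C_2^2 b_2 b_3}{(C_2 - C_1)(C_3 - C_2)}, \quad a_4 = \frac{C_3^2 b_3 b_4}{(C_3 - C_2)(C_4 - C_3)}, \quad \ldots \]

He could then take 

\[ b_1 = C_1, \quad b_2 = C_2 - C_1, \quad b_3 = C_3 - C_2, \quad b_4 = C_4 - C_3, \quad \ldots \]

to see that 

\[ a_1 = 1, \quad a_2 = C_1^2, \quad a_3 = C_2^2, \quad a_4 = C_3^2, \quad \ldots \]

Series (3.53) could thus be converted into the continued fraction 

\[ \frac{1}{C_1 +}\,\frac{C_1^2}{C_2 - C_1 +}\,\frac{C_2^2}{C_3 - C_2 +}\,\frac{C_3^2}{C_4 - C_3 +}\cdots. \tag{3.54} \]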


As an example of the series in (3.53), Euler considered 

\[ \log 2 = \int_0^1 \frac{dx}{1 + x} = \int_0^1 (1 - x + x^2 - x^3 + \cdots)\,dx = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \]

Since $C_1 = 1$, $C_2 = 2$, $C_3 = 3$, $C_4 = 4$, the corresponding continued fraction was 

\[ \log 2 = \frac{1}{1 +}\,\frac{1}{1 +}\,\frac{4}{1 +}\,\frac{9}{1 +}\,\frac{16}{1 +}\cdots. \]


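Then again, 

\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots, \]

so that $C_1 = 1$, $C_2 = 3$, $C_3 = 5$, $C_4 = 7, \ldots$ and 

\[ \frac{\pi}{4} = \frac{1}{1 +}\,\frac{1}{2 +}\,\frac{9}{2 +}\,\frac{25}{2 +}\,\frac{49}{2 +}\cdots; \]

note that Brouncker discovered the reciprocal of this result. 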
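More generally, Euler observed that 

\begin{align*}
\int_0^1 \frac{x^{n-1}}{1 + x^m}\,dx &= \int_0^1 \left(x^{n-1} - x^{m+n-1} + x^{2m+n-1} - \cdots\right) dx \\
&= \frac{1}{n} - \frac{1}{m+n} + \frac{1}{2m+n} - \frac{1}{3m+n} + \cdots \\
&= \frac{1}{n +}\,\frac{n^2}{m +}\,\frac{(m+n)^2}{m +}\,\frac{(2m+n)^2}{m +}\,\frac{(3m+n)^2}{m +}\cdots, \tag{3.55}
\end{align*}

where the last step follows from (3.54). 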
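The conversion (3.54) can be tested on finite truncations, since the kth convergent of the continued fraction should equal the kth partial sum of $1/C_1 - 1/C_2 + \cdots$. The sketch below is one way to do this in Python with exact fractions; the helper names are assumptions introduced only for this illustration.

```python
from fractions import Fraction

def euler_cf_from_reciprocal_series(C):
    """Numerators and denominators of (3.54) built from C_1, ..., C_k.

    Numerators: 1, C_1^2, C_2^2, ...; denominators: C_1, C_2 - C_1, C_3 - C_2, ...
    """
    nums = [1] + [c * c for c in C[:-1]]
    dens = [C[0]] + [C[i] - C[i - 1] for i in range(1, len(C))]
    return nums, dens

def evaluate_cf(nums, dens):
    """Evaluate nums[0]/(dens[0] + nums[1]/(dens[1] + ...)) from the innermost level outward."""
    value = Fraction(0)
    for num, den in zip(reversed(nums), reversed(dens)):
        value = Fraction(num) / (den + value)
    return value

def alternating_reciprocal_sum(C):
    """1/C_1 - 1/C_2 + 1/C_3 - ... over the given terms."""
    return sum(Fraction((-1) ** i, c) for i, c in enumerate(C))

# Euler's two examples, truncated: C_n = n (log 2) and C_n = 2n - 1 (pi/4).
for C in ([1, 2, 3, 4, 5], [1, 3, 5, 7, 9]):
    nums, dens = euler_cf_from_reciprocal_series(C)
    print(evaluate_cf(nums, dens), alternating_reciprocal_sum(C))
```

Each printed pair consists of two equal fractions. Taking `C = [n, m + n, 2*m + n, ...]` for chosen integers reproduces the pattern of numerators and denominators in (3.55).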


3.6 Euler: Riccati’s Equation and Continued Fractions 


Euler found continued fractions for $e$, its square and cube roots, and other related 
numbers. In his first paper on the topic,³² “De Fractionibus Continuis Dissertatio,” 
written in 1737 and published in 1744, he explained that he had initially found 
these expansions by studying the patterns in the continued fractions for the rational 
approximations of these numbers. It was only later that he attempted to prove the 
results. In the process, he discovered a connection with the Riccati equation and he 
employed this to establish his formulas.³³ It is interesting that Euler gave the main 
theorem of this paper in his 1731 letter to Goldbach.³⁴ For $e$ he had the expansion 

\[ e = 2 + \frac{1}{1 +}\,\frac{1}{2 +}\,\frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{4 +}\,\frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{6 +}\cdots, \tag{3.56} \]

obtained by taking the approximation $e \approx 2.71828182845904$ and applying 
the division algorithm. Cotes had earlier given this expansion by applying the same 


32 Eu. I-14 pp. 187-216. E 71, § 21-22. 
33 See our Section 14.9. 
34 Fuss (1968) pp. 57-59. 
procedure.³⁵ He used the continued fraction (3.56) to obtain rational approximations 
for $e$, noting that the successive convergents were alternately bigger or smaller 
than $e$. To find a continued fraction for $\pi$, take the approximation $\pi \approx 3.1416$. 
Observe that 

\[ \pi \approx 3 + \frac{177}{1250} = 3 + \cfrac{1}{\frac{1250}{177}} = 3 + \cfrac{1}{7 + \frac{11}{177}} = 3 + \cfrac{1}{7 + \cfrac{1}{\frac{177}{11}}} = 3 + \cfrac{1}{7 + \cfrac{1}{16 + \frac{1}{11}}}. \]
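The division algorithm applied above to 3.1416, and earlier to the decimal approximation of $e$, is easy to mechanize. The following Python sketch is an illustration only; the function name and the `max_terms` cutoff are assumptions made here. Because the decimal inputs are only approximations, only the leading partial quotients are meaningful.

```python
from fractions import Fraction

def continued_fraction(x, max_terms=20):
    """Partial quotients of a positive rational x, found by repeated division."""
    num, den = x.numerator, x.denominator
    terms = []
    while den != 0 and len(terms) < max_terms:
        quotient, remainder = divmod(num, den)
        terms.append(quotient)
        num, den = den, remainder
    return terms

# pi taken as 3.1416: the expansion terminates after four terms.
print(continued_fraction(Fraction(31416, 10000)))      # [3, 7, 16, 11]

# e taken as 2.71828182845904: keep only the first ten partial quotients.
print(continued_fraction(Fraction(271828182845904, 10**14), max_terms=10))
```

The second call shows the pattern 2, 1, 2, 1, 1, 4, 1, 1, 6, ... that Euler read off experimentally before he had a proof.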


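Similar to Cotes’s method, Euler took $\sqrt{e} \approx 1.6487212707$ and found 

\[ \sqrt{e} = 1 + \frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{5 +}\,\frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{9 +}\,\frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{13 +}\cdots. \tag{3.57} \]

Then again 

\[ \frac{\sqrt[3]{e} - 1}{2} = \frac{1}{5 +}\,\frac{1}{18 +}\,\frac{1}{30 +}\,\frac{1}{42 +}\,\frac{1}{54 +}\cdots, \tag{3.58} \]

and 

\[ \frac{e + 1}{e - 1} = 2 + \frac{1}{6 +}\,\frac{1}{10 +}\,\frac{1}{14 +}\,\frac{1}{18 +}\,\frac{1}{22 +}\,\frac{1}{26 +}\cdots. \tag{3.59} \]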


He observed that in (3.56) and (3.57), the arithmetic progressions of the 
denominators 2, 4, 6, ... and 1, 5, 9, 13, ... were interrupted by consecutive 1’s, whereas 
in (3.58) and (3.59) they were not. He showed how to convert the interrupted 
progressions into non-interrupted progressions. When he applied this procedure to 
(3.56) and (3.57), he got 

\[ e = 1 + \frac{2}{1 +}\,\frac{1}{6 +}\,\frac{1}{10 +}\,\frac{1}{14 +}\,\frac{1}{18 +}\,\frac{1}{22 +}\,\frac{1}{26 +}\cdots \tag{3.60} \]

and 

\[ \sqrt{e} = 1 + \frac{2}{3 +}\,\frac{1}{12 +}\,\frac{1}{20 +}\,\frac{1}{28 +}\cdots. \tag{3.61} \]


Euler then noted that he had not really proved any of these expansions and that it 
was only probable that the arithmetic progressions continued in the manner indicated. 
He wrote that after some exertion he had found a rigorous though peculiar proof that 
related the problem to differential equations. He stated without proof the theorem³⁶ 
that if 


35 Cotes (1714) p. 11. An English translation of this paper is available in appendix 1 of Gowing (1983). 
36 See E 71 § 28. 

\[ q = \frac{1}{p} + \frac{1}{\frac{3}{p} +}\,\frac{1}{\frac{5}{p} +}\cdots\frac{1}{\frac{2n-1}{p} +}\,\frac{1}{x^{2n/(2n+1)}\,y}, \tag{3.62} \]

where $p = (2n + 1)x^{1/(2n+1)}$, then $y$ satisfied the differential equation 

\[ dy + y^2\,dx = x^{-4n/(2n+1)}\,dx. \tag{3.63} \]

Euler’s expression for $q$ also contained a parameter $a$ but this can be taken to be equal 
to 1 without loss of generality. 

It is possible to give an inductive proof of this theorem and it is very likely that 
Euler had discovered that argument. Note that when $n = 0$ and when $n = 1$, (3.62) 
takes the form 

\[ q = y \quad\text{and}\quad q = \frac{1}{p} + \frac{1}{x^{2/3}\,y}. \tag{3.64} \]

The corresponding differential equations would be 

\[ dy + y^2\,dx = dx, \tag{3.65} \]

\[ dy + y^2\,dx = x^{-4/3}\,dx. \tag{3.66} \]


In this way, the solution of (3.66) required the solution of (3.65). More generally, 
the solution of the Riccati equation (3.63) depended on that of (3.65). However, Euler 
easily solved (3.65) by observing that it was equivalent to 

\[ dx = \frac{dy}{1 - y^2} \quad\text{or}\quad x = \frac{1}{2}\log\frac{y + 1}{y - 1}. \]

Since $x = p$ and $y = q$ for $n = 0$, Euler wrote the solution as 

\[ q = \frac{e^{2p} + 1}{e^{2p} - 1} \tag{3.67} \]


or, in modern terms, $q = \coth p$. Euler observed that when $n$ was an infinite number 
in (3.62), then 

\[ q = \frac{1}{p} + \frac{1}{\frac{3}{p} +}\,\frac{1}{\frac{5}{p} +}\,\frac{1}{\frac{7}{p} +}\cdots. \tag{3.68} \]


The result (3.68) is now called Lambert’s continued fraction, although Euler found 
it earlier. Now, since $e^{2p} = 1 + \frac{2}{q - 1}$, Euler saw that 

\[ e^{2p} = 1 + \frac{2}{\frac{1}{p} - 1 +}\,\frac{1}{\frac{3}{p} +}\,\frac{1}{\frac{5}{p} +}\,\frac{1}{\frac{7}{p} +}\cdots \]

or, with $p = \frac{1}{2s}$, 

\[ e^{1/s} = 1 + \frac{2}{2s - 1 +}\,\frac{1}{6s +}\,\frac{1}{10s +}\,\frac{1}{14s +}\cdots. \tag{3.69} \]

He then noted that (3.69) would in fact produce all those continued fractions he had 
obtained experimentally by using rational approximations. 
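As a numerical illustration of this last remark, one can evaluate the continued fraction in (3.69), read here as $e^{1/s} = 1 + 2/(2s - 1 + 1/(6s + 1/(10s + \cdots)))$, and compare it with the exponential function. The Python sketch below uses floating-point arithmetic; the function name and the truncation depth are assumptions chosen for illustration.

```python
import math

def exp_reciprocal_cf(s, depth=12):
    """Approximate e**(1/s) by truncating 1 + 2/(2s - 1 + 1/(6s + 1/(10s + ...)))."""
    tail = 0.0
    # The denominators 6s, 10s, 14s, ... are (4k + 2) * s for k = 1, 2, 3, ...
    for k in range(depth, 0, -1):
        tail = 1.0 / ((4 * k + 2) * s + tail)
    return 1.0 + 2.0 / (2 * s - 1 + tail)

for s in (1, 2, 3):
    print(s, exp_reciprocal_cf(s), math.exp(1.0 / s))
```

With $s = 1, 2, 3$ the two printed columns agree to within floating-point rounding, in line with the expansions for $e$, $\sqrt{e}$, and $\sqrt[3]{e}$ discussed in this section.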


3.7 Exercises 


(1) Prove that 


-1 
14:9 «16 7a ee 
Oe Soh dae, dae a 9 1+x2 , 
See Eu. I-14 pp. 292-297. 
(2) Show that 


m+ nt bt mtntctmtntdt | 


1 1 1 
= (mn + l)a+n oe 
mn+1 (mn + l)b+m+nt+ (mn4+1lc+tm+nt 


See Euler (1985) p. 313, and Eu. I-14 p. 205. 
(3) Show that 


ge 1 1 1 
= rnpg+n+g- see be 
Pq 1° phot O+ Po+ O+ Pdt+aq4 


where P = mnpq +mn+mq+ pq and Q=mnp+npqt+m+n+p+gq. 
See Euler (1985) p. 318, and Eu. I-14 p. 208. 


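(4) Show that 

\[ \sqrt{2} = 1 + \frac{1}{2 +}\,\frac{1}{2 +}\,\frac{1}{2 +}\cdots, \]

\[ \sqrt{3} = 1 + \frac{1}{1 +}\,\frac{1}{2 +}\,\frac{1}{1 +}\,\frac{1}{2 +}\cdots. \]

See Euler (1985) pp. 307–308 and Eu. I-14 p. 200. 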


(5) Show that if $x = a + \frac{1}{b +}\,\frac{1}{b +}\cdots$, then $x = a - \frac{b}{2} + \sqrt{1 + \frac{b^2}{4}}$. See Euler (1985) 
p. 308, and Eu. I-14 p. 201. 
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(6) Show that if x = a4 a ma vee = a . ++, thatis, if x is a periodic continued 
fraction, then x satisfies a quadratic equation. See Eu. I-14 p. 203. Euler stated 
the result in words, as opposed to symbolically. 


(7) Show that 


UX eS haw ae oS 
tan = 
4 14+ 24 24 24 


See Stieltjes’s letter to Hermite of March 4, 1891, in Baillaud and Bour- 
get (1905) p. 157. 


3.8 Notes on the Literature 


The introduction to Stedall’s excellent English translation of Wallis’s 1656 Arith- 
metica Infinitorum, Wallis and Stedall (2004), discusses the evolution of Wallis’s 
ideas and the influence of his book on his contemporaries and mathematical heirs. 
The article by Stedall in Grattan-Guinness (2005) may also be helpful, especially for 
its insight into how Wallis’s work influenced Newton. The fruitful collaboration of 
Wallis and Brouncker is also the subject of two interesting notes by Stedall (2000). In 
his Cambridge thesis, Whiteside (1961b) reconstructed Wallis’s attempt to recreate the 
continued fraction formula communicated to him without proof by Brouncker. This 
thesis is an informative and perceptive resource on seventeenth-century mathematics. 
Brezinski (1991) is a very useful book; it contains excerpts from original works 
accompanied by interesting historical commentary. 

Surprisingly, a translation into English of Euler’s De Fractionibus Continuis, 
Dissertatio appeared in the applied mathematics journal Mathematical Systems Theory 
(1985); the editors requested this translation, since they thought Euler’s discussion of 
Riccati’s equation could be useful to their readers. Khrushchev (2008) contains an 
English translation of Euler’s De Fractionibus Continuis, Observationes. Khrushchev 
gives a systematic and well-organized summary of the work on continued fractions 
by Wallis, Brouncker, Huygens, the Bernoullis, Euler, Lagrange, Gauss, Chebyshev, 
Stieltjes, and others. Khrushchev illustrates the process by which the ideas of earlier 
researchers in continued fractions have evolved into important modern theories, such 
as that of orthogonal polynomials. 


The Binomial Theorem 


4.1 Preliminary Remarks 


The discovery of the binomial theorem for general exponents exerted a tremendous 
impact on the development of analysis, especially the theory of power series. It also 
led to an understanding that an exponential function was defined by the property 
$f(a+b) = f(a)f(b)$. The binomial theorem was pivotal not only in the initial 
discovery of series for other important functions but also in the eventual consolidation 
of the foundations of analysis as a whole. The development of the theorem is 
particularly fascinating because it was independently found by both Newton and 
Gregory; because of the various approaches to its proof, including one by Euler; and 
because the validation of these proofs elicited the efforts of the best mathematicians 
of the nineteenth century. 
The binomial theorem for a positive integer exponent $n$ states that 

\[ (a+b)^n = a^n + A_1^n a^{n-1}b + A_2^n a^{n-2}b^2 + \cdots + A_{n-1}^n ab^{n-1} + b^n, \tag{4.1} \]

where the coefficients $A_k^n$ satisfy the additive rule 

\[ A_k^n = A_{k-1}^{n-1} + A_k^{n-1}, \tag{4.2} \]

and the multiplicative rule 

\[ A_k^n = \frac{n(n-1)\cdots(n-k+1)}{1\cdot 2\cdots k}, \tag{4.3} \]

where it is understood that $A_0^n = 1$. We here use a notation unusual today, because the 
notation $\binom{n}{k}$ or $C_k^n$ or $C_{n,k}$ may be suggestive of recent developments, whereas we 
wish to understand how these coefficients developed over time. Now we note that the 
additive rule (4.2) is not difficult to obtain. In terms of the notation used in (4.1), we 
can write 

\[ (a+b)^{n-1} = a^{n-1} + A_1^{n-1}a^{n-2}b + \cdots + A_{k-1}^{n-1}a^{n-k}b^{k-1} + A_k^{n-1}a^{n-k-1}b^k + \cdots + b^{n-1}. \]

Multiplying both sides by $a + b$, we have 

\[ (a+b)\left(a^{n-1} + \cdots + A_{k-1}^{n-1}a^{n-k}b^{k-1} + A_k^{n-1}a^{n-k-1}b^k + \cdots + b^{n-1}\right) = a^n + \cdots + A_k^n a^{n-k}b^k + \cdots + b^n. \]

Equating the coefficients of $a^{n-k}b^k$ on each side, we obtain (4.2). 
The multiplicative rule is somewhat more difficult to obtain. Observe that 

\[ (a+b)^n = (a+b)(a+b)\cdots(a+b), \]

where there are $n$ factors $(a+b)$. To find the coefficient of $a^{n-k}b^k$, note that for $a^{n-k}b^k$ 
we must take $b$ from $k$ of the factors $a + b$; the remaining $n - k$ factors $a + b$ contribute 
$n - k$ of the $a$’s. Thus, we see that the coefficient of $a^{n-k}b^k$ represents the number of 
ways $k$ $b$’s can be chosen from $n$ factors $a + b$. This number is given by the right-hand 
side of (4.3). 
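The agreement of the two rules can be checked directly for small $n$. The following Python sketch builds the coefficients once from the additive rule (4.2) alone and once from the multiplicative rule (4.3) alone; the function names are assumptions introduced for this illustration.

```python
from fractions import Fraction

def coeff_multiplicative(n, k):
    """The coefficient from rule (4.3): n(n-1)...(n-k+1) / (1*2*...*k)."""
    value = Fraction(1)
    for j in range(k):
        value *= Fraction(n - j, j + 1)
    return int(value)

def rows_additive(n_max):
    """Rows n = 0, ..., n_max of coefficients, built only with the additive rule (4.2)."""
    rows = [[1]]
    for _ in range(n_max):
        prev = rows[-1]
        inner = [prev[k - 1] + prev[k] for k in range(1, len(prev))]
        rows.append([1] + inner + [1])
    return rows

rows = rows_additive(8)
print(rows[5])                                             # [1, 5, 10, 10, 5, 1]
print([coeff_multiplicative(5, k) for k in range(6)])      # [1, 5, 10, 10, 5, 1]
```

The two constructions give the same table, even though the first uses only additions and the second only one product and quotient per entry.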

The binomial theorem has a complicated history. Some of its components can be 
traced back to the third or second century BCE, to Pingala’s Sanskrit Chandas sutra¹ 
(also Chhandas sutra) or Prosody aphorisms. Pingala most probably lived in the third 
century BCE. In the eighth and final chapter of his work, he dealt with the construction 
of tables (or prastara) of all possible sequences of n syllables, each syllable being 
either short (laghu = l) or long (guru = g). We call a sequence of n syllables a meter. 

Pingala’s rule for the construction of a table (or prastara) of all possible meters of n 
syllables was to place n g’s, or n long syllables, in the first row. After k rows, the (k + 1)th 
row would be constructed by entering g’s in each position until the first g appeared 
directly above in the preceding row, when l (representing a short syllable) would be 
entered. The (k + 1)th row would then be completed by making entries identical with 
the ones directly above. We present a table of the possible combinations of long and 
short syllables for meters of three syllables, and a corresponding table in which we set 
g = 0 and l = 1. 


g g g    000 
l g g    100 
g l g    010 
l l g    110 
g g l    001 
l g l    101 
g l l    011 
l l l    111 


1 Sridharan (2005) especially pp. 47-59. 
Pingala gave a rule for the number of the row in which a given syllable sequence, 
a given meter, would be found. He also gave a rule predicting the exact n-syllable 
sequence to be found in a given row. If g is set to be 0 and l to be 1, as in our table for 
words of three syllables, then Pingala’s rules amount to writing the numbers in binary 
notation. Thus, the fifth row reads g g l because 

\[ 5 - 1 = 4 = 0\cdot 1 + 0\cdot 2 + 1\cdot 2^2 \]

and conversely. 
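Read in this way, the two rules are the conversion of a row number to binary digits and back. The short Python sketch below illustrates that reading; the function names are hypothetical, and the digits are written least significant first, as in the table above.

```python
def prastara_row(row, n):
    """Meter found in the given row (counted from 1) of the table for n syllables."""
    value = row - 1
    syllables = []
    for _ in range(n):
        syllables.append("l" if value % 2 else "g")   # binary digit 1 -> l, 0 -> g
        value //= 2
    return "".join(syllables)

def row_of_meter(meter):
    """Row number (counted from 1) in which a given meter appears."""
    return 1 + sum(2 ** i for i, syllable in enumerate(meter) if syllable == "l")

print([prastara_row(r, 3) for r in range(1, 9)])
print(prastara_row(5, 3), row_of_meter("ggl"))
```

The first line reproduces the eight rows of the table for three syllables, and the second confirms that the fifth row is g g l.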

Pingala proceeded to raise and answer questions on combinations: how many 
meters of a given length have one long syllable, two long syllables, and so on? 
What is the total number of meters of a given length? Although the sutras containing 
the answers to such questions appear to us somewhat obscure, later prosodists have 
elaborated on these, clarifying them for us. In his tenth-century Mritasanjivani, a 
commentary on Pingala’s Chandas sutra, Halayudha explains the answer to the 
question concerning the number of long syllables contained in meters of a given 
length:² 


Draw a square. Beginning at half of the square, draw two other similar squares below it, below 
the two, three other squares, and so on. By putting one in the first square, the marking should 
be started. In the two squares of the second line, put 1 in each. In the third line put 1 in the two 
squares at the ends and in the middle square the sum of the digits in the two squares lying above 
it. In the fourth line put one in the two squares at the ends. In the middle ones put the sum total 
of the digits in the two squares above each. Proceed on in this way. Of these the second line gives 
the combinations with one syllable.... The third line gives the combinations with two syllables 
and etc. 


                1 
              1   1 
            1   2   1 
          1   3   3   1 
        1   4   6   4   1 
      1   5  10  10   5   1                (4.4) 


Thus, (4.4) gives six lines of the Meru Prastara, the Sanskrit name for Pascal’s 
triangle. Using this table, Pingala could determine, for instance, that the number 
of meters of length n was $2^n$. The sixth-century mathematician Varahamihira also 
clearly described the procedure for constructing this triangle. Observe that the method 
for constructing the Meru Prastara yields the additive rule for binomial coefficients. 
Another commentator from the tenth century, Bhattopala, gave the additive rule and 
also the multiplicative rule: “Putting down (the figures [numbers]) once in the reverse 
order, put them below again in the direct order. (In finding the final result) multiply 
the numbers in the process, i.e., from left to right and divide by the corresponding 
numbers below.” Thus, the rule specifies that we write the numbers as 

\[ \begin{array}{ccccc} n & (n-1) & (n-2) & \cdots & 1 \\ 1 & 2 & 3 & \cdots & n \end{array} \]

so, to find the number of meters of n syllables with 3 guru or long syllables, one 
would write 

\[ n\cdot(n-1)\cdot(n-2) \div 1\cdot 2\cdot 3. \]


2 Chakravarti (1932) p. 83. 
3 ibid. p. 85. 


Since each row of Pascal’s triangle is created by adding two consecutive numbers 
from the previous row with just one from the first row, we can deduce that Pingala 
understood the additive property of the binomial coefficients. Before Bhattopala, 
the multiplicative property was presented by Mahavira around 850 A.D. and may 
have been known by Indian mathematicians before that, since Mahavira was heir 
to a thousand-year line of Indians of the Jaina tradition researching combinatorial 
questions.⁴ 

In 628, the Indian mathematician Brahmagupta had explicitly stated the binomial 
formula (4.1) for n = 3 and immediately applied this to find the cube root of a given 
number.⁵ It appears that Brahmagupta would have been able to write down the formula 
for higher values of n, based on Pingala’s rule, and certainly Mahavira could have 
worked out the formula in general. 

In chapter 13 of his 1356 work, Ganita-Kaumudi, Narayana Pandita gave the 
binomial theorem. The topic of this chapter was combinatorial problems and Narayana 
begins the chapter thus: “For the pleasure of mathematicians, [I] now describe briefly 
anka-pasa [sequences of numbers or combinatorics] where bad, wicked and intoxicated 
mathematicians’ vanity shatters. The knowledge of anka-pasa is very useful in 
dramatics, prosody, medicine, garland-making, architecture, and mathematics.”⁶ 

In sutras 36-39 of chapter 13,⁷ Narayana extends the idea of meru (Pascal’s 
triangle) to sumeru, a table in which the binomial coefficients were multiplied by 
appropriate powers of a constant. With s denoting this constant, a sumeru could be 
written as 

\[ \begin{array}{lllll} 1 \\ s & 1 \\ s^2 & 2s & 1 \\ s^3 & 3s^2 & 3s & 1 \\ s^4 & 4s^3 & 6s^2 & 4s & 1. \end{array} \tag{4.5} \]


4 See Datta (1929). 

5 Brahmagupta (1817) p. 279. 

6 Narayana Pandita (2001) p. 23. 
7 ibid. p. 33. 
Thus, adding the quantities in a given row of (4.5) would yield an expression for a 
power of $(s + 1)$, with the first row giving the 0th power, the second row giving the first 
power, and so on. 

After describing this construction, Narayana specified that this method for the 
formation of sumeru had been given by the learned mathematicians. In sutra 67,⁸ 
Narayana wrote that the sums of the rows of the sumeru were in geometric progression. 
Thus, in our notation, the sum of the rows would be $(s + 1)^n$, $n = 0, 1, 2, 3, 4, \ldots$ and 
in this way, Narayana stated the binomial theorem. 

Now in the 1261 work of Yang Hui, A detailed analysis of mathematical methods in 
the nine chapters and their reclassifications,⁹ Pascal’s triangle is presented up through 
the sixth row, or by counting the zeroth row as the first, up through the seventh 
row. However, Yang Hui explained that his Pascal’s triangle had appeared earlier,¹⁰ 
in the work of the eleventh-century mathematician Jia Xian, and that Jia Xian had 
applied it to calculate the roots of numbers up through the fifth root by a method 
he had devised,¹¹ called “the method for extracting roots by iterated multiplication.” 
This method could be applied not only to equations of the form $x^n = N$, but also 
to general polynomial equations $f(x) = 0$. Jia Xian’s method is akin to Horner’s 
method for finding approximate solutions of algebraic equations. Jia Xian’s works, 
unfortunately, appear to have been lost and his contributions are now known through 
the attributions of Yang Hui. Clearly, the extraction of the fourth and fifth roots would 
require the binomial theorem for $n = 4$ and $n = 5$; thus, Jia Xian appears to have been 
aware of the binomial theorem. In addition, Pascal’s triangle through the eighth row 
may be found in Zhu Shijie’s 1303 book, Siyuan Yujian or Jade Mirror of the Four 
Unknowns.¹² 

The algebraist al-Karaji (953-1029) apparently lived in Baghdad during his most 
productive period.¹³ In his book Al-bahir, al-Samawal (c. 1130-1180) attributed the 
additive law of binomial coefficients as well as the expansion of $(a + b)^n$ to al-Karaji. 
In addition, al-Samawal showed how the expansion for $(a + b)^2$ implied the expansion 
for $(a + b)^3$, which in turn implied that of $(a + b)^4$. The argument he used was a type 
of induction also used by Euler and Lagrange; al-Samawal attributed the discovery of 
this type of argument to al-Karaji. In his surviving work al-Fakhri, al-Karaji gave the 
expansion for $(a + b)^3$ and in his al-Badi, he presented expansions for $(a - b)^3$ and 
$(a + b)^4$. Al-Kashi, who died in 1429, gave Pascal’s triangle up through the ninth 
power; he was aware of the additive and multiplicative rules for the binomial 
coefficients. 

In fact, in a lost work of around 1100, the noted poet Omar Khayyam apparently 
presented a method for finding the fourth, fifth, and higher roots of a given number.¹⁴ 
He wrote, “I have composed a book demonstrating the soundness of these methods 
leading to the discovery of required values and I have added methods for the solution 
of various other types. ... I refer to the extraction of the sides of the square of a square, 
the square of a cube, and the cube of a cube, etc. ... all of which is new. These proofs 
are arithmetical.” These results would certainly suggest knowledge of the binomial 
theorem. 

Al-Zanjani (d. 1262) gave a method for finding $(a + b + c + \cdots)^n$:¹⁵ 


We have concerned ourselves with the expression consisting of two terms because those which 
consist of three, four, or more terms are nothing but special cases of the two terms. Don’t you see 
that if you want to find the cube of a three term expression you combine two of them into one? 
That is, combine the first two and raise it to the third power. Also raise the third term itself into a 
cube. Multiply the third term by the square of the sum of the first two thrice. Then multiply the 
sum of the first two by the first two by the third thrice. The sum is the final answer to the original 
one. Follow the procedure for all the other powers. 


More than a hundred years before Pascal wrote his 1654 treatise on his triangle, 
Pascal’s triangle was surely known in Europe. In 1527, the German mathematician 
Petrus Apianus published his Arithmetic, and on its title page he gave Pascal’s triangle 
up through the ninth power. This triangle apparently became a part of received 
knowledge after the Italian mathematician Niccolo Tartaglia wrote his General 
Trattato of 1556. Nevertheless, it certainly appears that Newton was not familiar with 
the binomial theorem for positive integral exponents at the time he discovered his 
theorem of rational exponents. 

Newton discovered the general binomial theorem in the winter of 1664–65,¹⁶ while 
he was still a student at Cambridge. He was motivated by this discovery to develop 
his “method of infinite series” and apply it to several important problems. Indeed, 
the binomial theorem played a basic role in his approach to such topics as algebraic 
equations in two variables and differential equations. James Gregory independently 
found this theorem between 1668 and 1670, and it formed an important part of his 
original work on infinite series.¹⁷ 

Newton discussed particular cases of his theorem in two papers written in 1669 
and 1671. However, the first explicit statement of the general theorem for rational 
exponents appeared in a letter from Newton to Oldenburg, dated June 13, 1676. This 
letter was a response to an inquiry from Leibniz, who had learned of Newton’s series 
for $\arcsin x$ and $\sin x$ from the Danish mathematician Georg Mohr. Newton’s letter 
also introduced his new notation for exponents, as he explained:¹⁸ 


These are the foundation of these reductions: but extractions of roots are much shortened by this 
theorem, 

\[ (P + PQ)^{m/n} = P^{m/n} + \frac{m}{n}AQ + \frac{m-n}{2n}BQ + \frac{m-2n}{3n}CQ + \frac{m-3n}{4n}DQ + \text{etc.}, \]

where $P + PQ$ signifies the quantity whose root or even any power, or the root of a power, is 
to be found: $P$ signifies the first term of that quantity, $Q$ the remaining terms divided by the 
first, and $\frac{m}{n}$ the numerical index of the power of $P + PQ$, whether that power is integral or (so 
to speak) fractional, whether positive or negative. For as analysts, instead of $aa$, $aaa$, etc., are 
accustomed to write $a^2$, $a^3$, etc., so instead of $\sqrt{a}$, $\sqrt{a^3}$, $\sqrt{c}:a^5$, etc. I write $a^{1/2}$, $a^{3/2}$, $a^{5/3}$, and 
instead of $\frac{1}{a}$, $\frac{1}{aa}$, $\frac{1}{a^3}$ I write $a^{-1}$, $a^{-2}$, $a^{-3}$. And so for 

\[ \frac{aa}{\sqrt{c}:(a^3 + bbx)} \]

I write $aa(a^3 + bbx)^{-1/3}$, and for 

\[ \frac{aab}{\sqrt{c}:\{(a^3 + bbx)(a^3 + bbx)\}} \]

I write $aab(a^3 + bbx)^{-2/3}$ ··· 


15 ibid. p. 404. 

16 Newton (1967-1981) vol. 1, pp. 104-108. 
17 Turnbull (1939) p. 131. 

18 Newton (1959-1960) vol. 2, pp. 32 and 42. 


In Newton’s formula, A denotes the first term, B the second term, and so on, 
such notation being common at that time. Note also that $\sqrt{c}:x$ stands for the cube 
root of $x$. 
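Newton's rule is recursive: each term is obtained from the one before it, the kth new term being $\frac{m-(k-1)n}{kn}$ times the previous term times $Q$. A minimal Python sketch of this recursion follows, under the simplifying assumption $P = 1$, so that only the numerical coefficients of $Q^0, Q^1, Q^2, \ldots$ are tracked; the function name is introduced here for illustration.

```python
from fractions import Fraction

def newton_binomial_terms(m, n, count):
    """Coefficients of Q^0, Q^1, ... in (1 + Q)^(m/n), following Newton's rule with P = 1.

    A is the first term, B = (m/n) A Q, C = ((m - n)/(2n)) B Q, and so on.
    """
    terms = [Fraction(1)]
    for k in range(1, count):
        terms.append(terms[-1] * Fraction(m - (k - 1) * n, k * n))
    return terms

# Index 1/2: coefficients 1, 1/2, -1/8, 1/16, -5/128.
print(newton_binomial_terms(1, 2, 5))

# Index 4: the series stops, since the coefficient of Q^5 is 0.
print(newton_binomial_terms(4, 1, 6))
```

For a positive integer index the recursion reproduces a row of Pascal's triangle and then terminates in zeros, while for a fractional index it continues indefinitely.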

Intrigued by Newton’s groundbreaking work, Leibniz responded with some of his 
own discoveries on series and requested details about the origin and derivation of 
Newton’s results, especially the binomial theorem.¹⁹ Newton wrote a lengthy reply, 
amounting to nineteen printed pages, in his letter of October 24, 1676,²⁰ again 
through Oldenburg. Newton explained that in 1664–1665, he was inspired by Wallis’s 
Arithmetica Infinitorum to consider the integral $\int_0^x (1 - t^2)^{1/2}\,dt$ and to expand the 
integrand. He looked at the absolute values of the coefficients of the polynomials 

\[ (1 - x^2)^0 = 1, \quad (1 - x^2)^1 = 1 - x^2, \quad (1 - x^2)^2 = 1 - 2x^2 + x^4, \]

\[ (1 - x^2)^3 = 1 - 3x^2 + 3x^4 - x^6, \quad (1 - x^2)^4 = 1 - 4x^2 + 6x^4 - 4x^6 + x^8, \]

and asked how the (absolute) values of the first two coefficients of any of these 
polynomials could produce the remaining coefficients:²¹ 


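found that on putting m for the second figure [coefficient], the rest could be produced by a 
continual multiplication of the terms of this series, 

\[ \frac{m-0}{1} \times \frac{m-1}{2} \times \frac{m-2}{3} \times \frac{m-3}{4} \times \frac{m-4}{5}, \text{ etc.} \]

For example, let $m = 4$, and $4 \times \frac{1}{2}(m - 1)$, that is 6 will be the third term, and $6 \times \frac{1}{3}(m - 2)$, 
that is 4 the fourth, and $4 \times \frac{1}{4}(m - 3)$, that is 1 the fifth, and $1 \times \frac{1}{5}(m - 4)$, that is 0 is the sixth, 
at which term in this case the series stops. Accordingly, ..., for the circle, ..., I put $m = \frac{1}{2}$ and the 
terms arising were 

\[ \frac{1}{2} \times \frac{\frac{1}{2} - 1}{2} \text{ or } -\frac{1}{8}, \quad -\frac{1}{8} \times \frac{\frac{1}{2} - 2}{3} \text{ or } +\frac{1}{16}, \quad \frac{1}{16} \times \frac{\frac{1}{2} - 3}{4} \text{ or } -\frac{5}{128}, \]

and so to infinity. 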


Thus, Newton learned how to generate the binomial series when the exponent was 
any number $m$ and, by taking $m = \frac{1}{2}$, he obtained the expansion 

\[ (1 - x^2)^{1/2} = 1 - \frac{1}{2}x^2 - \frac{1}{8}x^4 - \frac{1}{16}x^6 - \frac{5}{128}x^8 - \cdots, \]

from which he derived the value of the integral $\int_0^x (1 - t^2)^{1/2}\,dt$ as an infinite series. 


19 ibid. pp. 57-71. 
20 ibid. pp. 110-161. 
21 ibid. pp. 130-131. 

It is curious that Newton was unaware of the work of Briggs, Pascal, and others on 
the multiplicative formula for binomial coefficients. It seems that the mathematical 
texts Newton studied as a student did not contain the multiplicative formula. In 
fact, Wallis wrote in 1685 that he had not known this formula when he wrote 
his Arithmetica Infinitorum.²² This is surprising because this work included the 
multiplicative expression for figurate numbers, intimately connected with binomial 
coefficients. In any case, Wallis’s book was apparently sufficiently suggestive for 
Newton to make his discovery about $C_{n,k}$ for integral $n$ and then extend it to 
fractional $n$ by following Wallis once again. Newton attempted to verify his theorem 
by the interpolation methods he had learned from Wallis²³ but he soon found more 
satisfactory techniques, described in his letter of October 24, 1676.²⁴ 


For in order to test these processes, I multiplied 

\[ 1 - \frac{1}{2}x^2 - \frac{1}{8}x^4 - \frac{1}{16}x^6, \text{ etc.} \tag{4.6} \]

into itself; and it became $1 - x^2$, the remaining terms vanishing by the continuation of the series 
to infinity. And even so $1 - \frac{1}{3}x^2 - \frac{1}{9}x^4 - \frac{5}{81}x^6$, etc. multiplied twice into itself also produced 
$1 - x^2$. And as this was not only sure proof of these conclusions so too it guided me to try whether, 
conversely, these series, which it thus affirmed to be roots of the quantity $1 - x^2$, might not be 
extracted out of it in an arithmetical manner. And the matter turned out well. This was the form 
of the working in square roots. 


After getting this clear I have quite given up the interpolation of series, and have made use of 
these operations only, as giving more natural foundations. 
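Newton's test by multiplication can be replayed with exact arithmetic. The sketch below is ours and is not Newton's own working: it builds the coefficients of $(1 - y)^{1/q}$ in powers of $y = x^2$, truncates after a fixed number of terms, and multiplies the truncated series into itself. The names `root_series` and `multiply` are hypothetical helpers. Because a coefficient of the product depends only on coefficients of equal or lower degree, the truncated product is exact up to the truncation order.

```python
from fractions import Fraction

def root_series(q, terms):
    """Coefficients, in powers of y = x^2, of the series for (1 - y)^(1/q)."""
    m = Fraction(1, q)
    coeffs = [Fraction(1)]
    for k in range(terms - 1):
        # binomial recurrence, with a sign change for each extra power of -y
        coeffs.append(-coeffs[-1] * (m - k) / (k + 1))
    return coeffs

def multiply(a, b, terms):
    """Product of two series, keeping only powers below `terms`."""
    out = [Fraction(0)] * terms
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < terms:
                out[i + j] += ai * bj
    return out

square_root = root_series(2, 6)
cube_root = root_series(3, 6)
print(multiply(square_root, square_root, 6))
print(multiply(multiply(cube_root, cube_root, 6), cube_root, 6))
```

In both cases every coefficient beyond the first two cancels, which is what Newton meant by the remaining terms vanishing.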


Newton realized that all the algebraic operations could be applied to infinite series 
and that series could be viewed as the algebraic analogs of infinite decimals. Just as 


22 Wallis (1685) pp. 318-320. 
23 Newton (1967-1981) vol. 1, pp. 106-107. 
24 Newton (1959-1960) vol. 2, pp. 131-132. 
the latter appear when division and root extraction are performed on integers, infinite
series result when these operations are performed on polynomials. In the preceding 


quote, Newton explained that when he applied the square root algorithm to $1 - x^2$,
the result was the series for $(1 - x^2)^{1/2}$. For division, Newton gave the example of the
geometric series 


\[ \frac{1}{d + e} = \frac{1}{d} - \frac{e}{d^2} + \frac{e^2}{d^3} - \frac{e^3}{d^4} + \text{etc.} \tag{4.7} \]


In searching for the proof of the binomial theorem, Newton looked no further than 
a few cases, and he verified these by multiplication. We shall see that this method is 
the basis for one proof of the binomial theorem, due to Euler.²⁵

James Gregory first revealed his discovery of the binomial theorem in a letter to
his longtime correspondent, John Collins. First, on March 24, 1670, Collins wrote
to Gregory, mentioning some mysterious work done by Newton:²⁶ “Mr Newtone of
Cambridge sent the following series for finding the Area of a Zone of a Circle to
Mr. Dary, to compare with the said Dary’s approaches, putting R the radius and B
the parallell [sic] distance of a Chord from the Diameter the Area of the space or Zone
betweene [sic] them is $= 2RB - \frac{B^3}{3R} - \frac{B^5}{20R^3} - \frac{B^7}{56R^5} - \cdots$.” This area is given by the
integral $2\int_0^B \sqrt{R^2 - x^2}\,dx$. We note that Newton obtained the series by expanding
the integrand as a binomial series and then doing term-by-term integration.

Gregory then formulated the binomial theorem in a letter to Collins of November
30, 1670,²⁷ stated as the solution of a problem: Use the numbers $b$, $b + d$ and the values
of their logarithms, $e$ and $e + c$, respectively, to find the number whose logarithm is
$e + a$. Gregory wrote that the desired number was given by the series


\[ b + \frac{a}{c}d + \frac{a(a - c)}{c \cdot 2c}\frac{d^2}{b} + \frac{a(a - c)(a - 2c)}{c \cdot 2c \cdot 3c}\frac{d^3}{b^2} + \text{etc.} \tag{4.8} \]


Since


\[ \ln\left(b\left(1 + \frac{d}{b}\right)^{a/c}\right) = \ln b + \frac{a}{c}\bigl(\ln(b + d) - \ln b\bigr) = e + \frac{a}{c}(e + c - e) = e + a, \tag{4.9} \]


we see that the series is the binomial expansion of $b\left(1 + \frac{d}{b}\right)^{a/c}$. In this letter, Gregory
also stated his general interpolation formula; the manner in which he stated this
formula and the binomial theorem suggests that the latter was derived from the former.

In spite of the fact that he had already found the binomial theorem, it was not 
until December 1670 that Gregory could perceive the origin of Newton’s series. 
Gregory explained in a letter of December 19 to Collins, that he had derived numerous 


25 Eu. I-15 pp. 207-216. E 465.
26 Turnbull (1939) p. 89. 
27 ibid. pp. 118-137. 
series for the circle and had mistakenly expected Newton’s to be a corollary of at
least one of them. He added, “I admire much my own dulness [sic], that in such a
considerable time, I had not taken notice”²⁸ that Newton’s series followed from a
binomial expansion. 

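Note that the interpolation formula can be written as


\[ f(x) = f(0) + x\,\Delta f(0) + \frac{x(x - 1)}{2!}\Delta^2 f(0) + \frac{x(x - 1)(x - 2)}{3!}\Delta^3 f(0) + \cdots; \]


in this connection, see Section 9.3.
To derive the binomial theorem, take


\[ f(x) = b\left(1 + \frac{d}{b}\right)^x \]


so that

\[ \Delta f(x) = f(x + 1) - f(x) = b\left(1 + \frac{d}{b}\right)^x \frac{d}{b}, \]
\[ \Delta^2 f(x) = \Delta f(x + 1) - \Delta f(x) = b\left(1 + \frac{d}{b}\right)^x \frac{d^2}{b^2}, \]
\[ \Delta^3 f(x) = b\left(1 + \frac{d}{b}\right)^x \frac{d^3}{b^3}. \]

Thus,

\[ \Delta f(0) = d, \quad \Delta^2 f(0) = \frac{d^2}{b}, \quad \Delta^3 f(0) = \frac{d^3}{b^2}, \]


and we get Gregory’s series (4.8) by taking $x = \frac{a}{c}$ in the interpolation formula.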
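As a numerical illustration of Gregory's problem, the following sketch sums the series (4.8) and checks that the logarithm of the result is $e + a$. It is only a sketch under stated assumptions: natural logarithms are used, the values $b = 10$, $d = 1$, $a = 0.05$ are arbitrary choices of ours, and the function name `gregory_series` is hypothetical. The series is summed in floating point, which is adequate here because $|d/b| < 1$.

```python
import math

def gregory_series(b, d, a, c, terms=40):
    """Sum b + (a/c) d + a(a-c)/(c*2c) d^2/b + ... to the given number of terms."""
    total = 0.0
    coeff = 1.0  # a(a - c)...(a - (k-1)c) / (c * 2c * ... * kc), equal to 1 for k = 0
    for k in range(terms):
        total += coeff * b * (d / b) ** k      # b (d/b)^k = d^k / b^(k-1)
        coeff *= (a - k * c) / ((k + 1) * c)   # pass from the k-th to the (k+1)-th coefficient
    return total

b, d = 10.0, 1.0
e = math.log(b)                # logarithm of b
c = math.log(b + d) - e        # so that the logarithm of b + d is e + c
a = 0.05
number = gregory_series(b, d, a, c)
print(math.log(number), e + a)
```

The two printed values agree to within rounding error, so the number produced by the series is the one whose logarithm is $e + a$.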

Interestingly, Gregory’s derivation is logically more sound than Newton’s original 
argument by a Wallis interpolation. Yet the two derivations both involve interpolation 
with respect to the exponent. In spite of their highly imaginative and useful work in 
this area, both Newton and Gregory failed to give well-founded derivations of the 
binomial series. Eighteenth-century mathematicians made very interesting attempts 
to fill this gap, but it took until the nineteenth century to find a completely rigorous 
derivation. 

In the eighteenth century, it was generally known that the binomial expansion for
$f(x) = (1 + x)^\alpha$ could be obtained as the series solution of the equation


\[ (1 + x)\frac{dy}{dx} = \alpha y. \tag{4.10} \]


28 ibid. p. 148. 
Of course, whether a series could be differentiated term by term and whether
a differential equation had a series solution were not then seen as problems. The 
point that bothered the English mathematician John Landen (1719-1790) was that the 
proof used derivatives (fluxions) to obtain a result in algebra. In his 1758 Discourse 
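Concerning the Residual Analysis,²⁹ Landen applied the algebraic identity


\[ \frac{x^{m/n} - y^{m/n}}{x - y} = x^{\frac{m}{n} - 1} \cdot \frac{1 + q + q^2 + \cdots + q^{m-1}}{1 + q^{m/n} + q^{2m/n} + \cdots + q^{(n-1)m/n}}, \tag{4.11} \]


where $q = \frac{y}{x}$ and $m$ and $n$ were integers, to avoid differentiation.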
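Euler took a different approach, presenting a proof using Newton’s idea of
multiplication of series.³⁰ He showed that if


\[ f(m) = 1 + \frac{m}{1}x + \frac{m(m - 1)}{1 \cdot 2}x^2 + \frac{m(m - 1)(m - 2)}{1 \cdot 2 \cdot 3}x^3 + \text{etc.}, \]

then

\[ f(m + n) = f(m) \cdot f(n). \tag{4.12} \]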


His proof consisted in demonstrating that the coefficients of $x^k$ on both sides of
equation (4.12) were the same. This was sufficient to derive the binomial theorem for 
rational exponents, except that he did not address convergence questions, particularly 
in the case of the product of two series. We must note that seventeenth- and eighteenth- 
century mathematicians had more or less clear ideas of convergence of series, but only 
occasionally did they apply these ideas to the series arising in their work. As examples 
of rigor, Grégoire St. Vincent (1584-1667) gave an entirely rigorous treatment of 
the geometric series in his Opus Geometricum of 1647. Twenty years later, Wallis 
discussed the logarithmic series³¹ with a careful analysis of the remainder term,
obtainable from the remainder in a geometric series. Leibniz gave a clear account 
of an alternating series in which the terms decrease to zero; he wrote to Jakob 
Hermann about this in a letter of June 26, 1705,³² although he had done this work
two decades earlier. Later, on January 10, 1714, in a letter to Johann Bernoulli,³³
Leibniz discussed alternating series, among other topics. In addition, he included his 
alternating series theorem as proposition 49 of his De quadratura arithmetica circuli 
ellipseos et hyperbolae, the first full and accurate publication of which appeared only 
in the twentieth century, thanks to the diligent efforts of the Leibniz scholar Eberhard 
Knobloch.³⁴

Some gems from the eighteenth century include Stirling’s 1719 criterion for 
convergence³⁵ based on second differences of the terms of a series (though it required
an amendment) and Maclaurin’s statement and proof of the integral test in his 1742 
Treatise of Fluxions. Moreover, d’Alembert made some comments on the convergence


29 Landen (1758) pp. 5-7.

30 Eu. I-15 pp. 207-216. E 465. 

31 Wallis (1668). 

32 Leibniz (1971) vol. 4, pp. 272-275, especially p. 273. 

33 Bernoulli and Leibniz (1745) vol. 2, pp. 329-331, especially p. 330. 
34 Leibniz and Knobloch (1993) p. 115. 

35 Stirling (1719), especially pp. 1067-1070. 
of series from which the ratio test can be developed.³⁶ Also of note is Edward Waring’s
1776 statement without proof of the ratio test, now known as Raabe’s test, from which 
Waring derived the convergence of the series )°”-_, ao) in this connection, see our 
Section 23.6. We mention that Gauss greatly extended this ratio test in his famous 
work on hypergeometric series. 

To extend Euler’s proof to all real exponents, it was necessary to give a precise 
definition of continuity. Bernard Bolzano (1781-1848) and A. L. Cauchy (1789-1856) 
independently accomplished this. Bolzano was a professor of theology at Prague; his 
main interests were in philosophy and mathematics. He defined continuity in an 1817 
paper³⁸ on the intermediate value theorem: A function $f(x)$ varies according to the
law of continuity for all values of $x$ inside or outside certain limits if the difference
$f(x + \omega) - f(x)$ can be made smaller than any given quantity, provided $\omega$ can be taken
as small as we please. Bolzano’s definition leaves little to be desired.

Cauchy emphasized rigor in analysis from the very beginning of his teaching career 
at the École Polytechnique. His published lectures from 1821 and 1823 discussed the
concepts of limits, continuity, and convergence. His 1821 lectures Analyse algébrique 
gave his definition of continuity, not quite as good as Bolzano’s: The function f(x) 
will be a continuous function of the variable x between two assigned bounds if, 
for each value of $x$ between these bounds, the numerical value of the difference
$f(x + \alpha) - f(x)$ decreases indefinitely with $\alpha$.

Cauchy derived the continuity of the series f(m) in (4.12) from the erroneous result 
that if every term of an infinite series is continuous and the series is convergent, then 
the series is continuous. In fact, in his 1826 paper on the binomial theorem,³⁹ Abel
noted that Cauchy’s theorem on the continuity of a series admits of exception. For
example, in a footnote Abel wrote that


\[ \sin\phi - \frac{1}{2}\sin 2\phi + \frac{1}{3}\sin 3\phi - \cdots \tag{4.13} \]

was discontinuous for every value $(2m + 1)\pi$ of $\phi$, where $m$ is a whole number.
Abel then proceeded to state and prove his famous continuity theorem for power 
series, using the method of summation by parts. This method had been known for 
over a century, but Abel was the first to apply it to problems of convergence of series. 
Dirichlet profited from Abel’s paper and used these ideas very effectively in his study 
of L-series less than a decade later. Interestingly, Abel gleaned ideas of mathematical 
rigor from Cauchy’s lectures, obtained from his friend Crelle’s library; Abel’s paper 
appeared in Crelle’s newly founded journal. 

The concept of uniform convergence was implied in Abel’s continuity theorem, but 
its explicit formulation came later. First, C. Gudermann observed in an 1838 paper⁴⁰
on modular functions, published in Crelle’s journal, that he had obtained a certain 
series having the same convergence rate for all values of the variable. A year later, 


36 See Grabiner (1981) pp. 60-64. 

37 See Gonzalez-Velasco (2011) p. 391. 

38 Bolzano (1980) contains an English translation of this paper by S. B. Russ.
39 Abel (2007) pp. 105-138, especially p. 111, footnote 3. 

40 Gudermann (1838) pp. 251-252. 
K. Weierstrass was the only student in Gudermann’s course on modular functions.
Weierstrass introduced the term uniform convergence, understood its importance, and
gave its definition in an 1841 paper “Zur Theorie der Potenzreihen,”⁴¹ submitted
as part of his examination for teaching certification. Gudermann declared, “The
candidate hereby enters by birthright into the ranks of discoverers crowned with
glory.”⁴² Unfortunately, the paper was not published until 1894. During the winter
of 1859-60, Weierstrass lectured at the University of Berlin on the foundations of 
analysis, but it took some time before his ideas spread to other European countries 
and to America. In a letter of 1881 to his former student Hermann A. Schwarz, 
Weierstrass observed that people in France were finally grasping the importance of the 
idea of uniform convergence. Finally, in the last two decades of the nineteenth century, 
textbooks containing Weierstrassian ideas appeared in several languages. The British 
mathematical physicist G. G. Stokes⁴³ (1819-1903) and Dirichlet’s student P. Seidel⁴⁴
(1821-1896) also wrote on concepts related to uniform convergence, though their
papers did not have much influence. 

Interestingly, Cauchy wrote a paper in 1853 acknowledging his mistake on
continuity, noting that it was easy to rectify. He then proceeded to work with uniform 
convergence without naming the concept, so it is not clear whether he fully realized 
the wider significance of the idea. 




4.2 Landen’s Derivation of the Binomial Theorem 


In 1758, John Landen presented the standard eighteenth-century derivation of the
binomial theorem,⁴⁶ in which one assumes the series expansion


\[ (1 + x)^{m/n} = 1 + ax + bx^2 + cx^3 + dx^4 + \text{etc.} \tag{4.14} \]


Now take the derivative of each side to get


\[ \frac{m}{n}(1 + x)^{\frac{m}{n} - 1} = a + 2bx + 3cx^2 + 4dx^3 + \text{etc.} \tag{4.15} \]


Multiply the last equation by $1 + x$ to see that


\[ \frac{m}{n}\left(1 + ax + bx^2 + cx^3 + \text{etc.}\right) = (1 + x)\left(a + 2bx + 3cx^2 + 4dx^3 + \text{etc.}\right). \]
n 


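Equate coefficients to obtain

\[ a = \frac{m}{n}, \quad 2b + a = \frac{m}{n}a, \quad 3c + 2b = \frac{m}{n}b, \quad 4d + 3c = \frac{m}{n}c, \quad \text{etc.}, \]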


41 Weierstrass (1894-1927) vol. 1, pp. 67-74. 
42 Klein (1979) p. 263. 

43 Stokes (1849). 

44 Seidel (1847). 
so that

\[ b = \frac{m(m - n)}{2n^2}, \quad c = \frac{m(m - n)(m - 2n)}{2 \cdot 3\, n^3}, \quad d = \frac{m(m - n)(m - 2n)(m - 3n)}{2 \cdot 3 \cdot 4\, n^4}, \quad \text{etc.}, \]


proving the binomial theorem. Landen also gave an alternative method, avoiding 
differentiation, by starting with (4.14) and applying (4.11) to get 


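\[ \frac{(1 + x)^{m/n} - (1 + y)^{m/n}}{x - y} = (1 + x)^{\frac{m}{n} - 1} \cdot \frac{1 + \frac{1 + y}{1 + x} + \left(\frac{1 + y}{1 + x}\right)^2 + \cdots + \left(\frac{1 + y}{1 + x}\right)^{m-1}}{1 + \left(\frac{1 + y}{1 + x}\right)^{m/n} + \left(\frac{1 + y}{1 + x}\right)^{2m/n} + \cdots + \left(\frac{1 + y}{1 + x}\right)^{(n-1)m/n}} \]

\[ = a + b(x + y) + c(x^2 + xy + y^2) + d(x^3 + x^2y + xy^2 + y^3) + \text{etc.} \]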


He then observed that the last equation is an algebraic identity true for all values of
$y$, so that he could take $y = x$ to obtain (4.15).
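Landen's device can be checked numerically. In the sketch below, which is ours and uses hypothetical function names, `difference_quotient` evaluates the left-hand quotient directly, while `landen_quotient` evaluates the algebraic expression built from two finite geometric sums. The second function remains meaningful when the two arguments coincide, and there it returns $\frac{m}{n}(1 + x)^{\frac{m}{n} - 1}$ without any appeal to differentiation. The sample values of $x$, $y$, $m$, $n$ are arbitrary.

```python
def difference_quotient(X, Y, m, n):
    """(X^(m/n) - Y^(m/n)) / (X - Y), defined only for X != Y."""
    return (X ** (m / n) - Y ** (m / n)) / (X - Y)

def landen_quotient(X, Y, m, n):
    """The same quantity written with finite sums in q = Y/X."""
    q = Y / X
    numerator = sum(q ** j for j in range(m))              # 1 + q + ... + q^(m-1)
    denominator = sum(q ** (j * m / n) for j in range(n))  # 1 + q^(m/n) + ... + q^((n-1)m/n)
    return X ** (m / n - 1) * numerator / denominator

x, y, m, n = 0.5, 0.2, 2, 3
print(difference_quotient(1 + x, 1 + y, m, n), landen_quotient(1 + x, 1 + y, m, n))
# Setting y = x: the sums collapse to m and n respectively
print(landen_quotient(1 + x, 1 + x, m, n), (m / n) * (1 + x) ** (m / n - 1))
```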

Almost seven decades later, Abel objected to differentiation in this context, not 
because he perceived the binomial series as algebraic but, as he wrote from Berlin 
to his friend and former teacher Holmboe, he thought it impermissible to apply 
operations on infinite series as if they were finite.⁴⁷ He noted that it had not been
proved that the derivative of an infinite series could be obtained by taking the
derivative of each term, and that there were numerous counterexamples. For example,
he observed, the sum of the series (4.13) was $\frac{\phi}{2}$ in the interval $-\pi < \phi < \pi$. Taking
derivatives gave


\[ \frac{1}{2} = \cos\phi - \cos 2\phi + \cos 3\phi - \text{etc.}, \]

a clearly false result, because the series was divergent. In 1841, Weierstrass finally
addressed Abel’s concerns when he developed the theorems for differentiation and
integration of series.⁴⁸
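Abel's counterexample is easy to observe numerically. The following sketch, with function names of our own choosing, compares partial sums of the sine series (4.13) with $\frac{\phi}{2}$ and then prints partial sums of the series obtained by differentiating term by term. The sample point $\phi = 1$ and the numbers of terms are arbitrary choices.

```python
import math

def sine_partial_sum(phi, terms):
    """Partial sum of sin(phi) - sin(2 phi)/2 + sin(3 phi)/3 - ..."""
    return sum((-1) ** (k + 1) * math.sin(k * phi) / k for k in range(1, terms + 1))

def cosine_partial_sum(phi, terms):
    """Partial sum of the termwise derivative cos(phi) - cos(2 phi) + cos(3 phi) - ..."""
    return sum((-1) ** (k + 1) * math.cos(k * phi) for k in range(1, terms + 1))

phi = 1.0
print(sine_partial_sum(phi, 20000), phi / 2)
# The differentiated series has terms that do not tend to zero,
# so its partial sums keep oscillating instead of settling at 1/2.
print([round(cosine_partial_sum(phi, n), 3) for n in (10, 11, 12, 13, 14)])
print(cosine_partial_sum(0.0, 100), cosine_partial_sum(0.0, 101))
```

At $\phi = 0$ the differentiated series is $1 - 1 + 1 - \cdots$, whose partial sums alternate between 1 and 0.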


4.3 Euler: Binomial Theorem for Rational Exponents 


In 1773, Euler presented his paper, published in 1775, “Demonstratio theorematis 
Newtoniani”⁴⁹ to the Petersburg Academy, giving a proof of the binomial theorem.


47 Abel (2007) pp. 482-487, translation by Horowitz. 
48 Weierstrass (1894-1927) vol. 1, pp. 67-74. 
49 Eu. I-15 pp. 207-216. E 465.
In the first section of this paper, Euler wrote that this theorem stood as a foundation
for the whole of higher analysis, so that a rigorous proof was clearly called for.
He explained that to avoid circularity, he wished to give a demonstration not using
differentiation, since he had used the binomial theorem in his differential calculus
book of 1755 to show that the derivative of $x^n$ is $nx^{n-1}$.⁵⁰

Eighteenth-century mathematicians used differentiation in some form to find the 
binomial expansion. They also assumed that $(1 + x)^n$, where $n$ was any real number,
was expandable as an infinite series; for example, Landen had made this assumption. 
In this paper, Euler avoided this difficulty. 

Moreover, in 1763, the German scientist Franz Aepinus published an inductive 
proof for positive integral exponents in the Petersburg Academy journal.⁵¹ Euler
thought that the argument, while ingenious, was quite obscure. 

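Euler started by observing that since $(a + b)^n = a^n\left(1 + \frac{b}{a}\right)^n$, it was sufficient to
obtain the expansion of $(1 + x)^n$. He set


\[ [m] = 1 + \frac{m}{1}x + \frac{m}{1} \cdot \frac{m - 1}{2}x^2 + \text{etc.}, \tag{4.16} \]


with the aim of proving that $[m] = (1 + x)^m$ when $m$ was a fraction. Note that he
already knew that the result was true when $m$ was a positive integer. The important
step in his proof was to show that $[m] \cdot [n] = [m + n]$. He wrote


\[ [n] = 1 + \frac{n}{1}x + \frac{n}{1} \cdot \frac{n - 1}{2}x^2 + \text{etc.}, \]

so that

\[ \begin{aligned} {[m]} \cdot [n] = 1 &+ \frac{m}{1}x + \frac{m}{1} \cdot \frac{m - 1}{2}x^2 + \text{etc.} \\ &+ \frac{n}{1}x + \frac{m}{1} \cdot \frac{n}{1}x^2 + \text{etc.} \\ &+ \frac{n}{1} \cdot \frac{n - 1}{2}x^2 + \text{etc.} \end{aligned} \]


Thus, the product had the form $1 + Ax + Bx^2 + Cx^3 + \text{etc.}$, where


\[ A = m + n, \quad B = \frac{m(m - 1)}{2} + mn + \frac{n(n - 1)}{2} = \frac{m + n}{1} \cdot \frac{m + n - 1}{2}. \]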


nn—n mt+tn—Il mtn m+n-—1 
A=m-+n, B= + mn 4 = . 
2 2 1 2 


Euler then observed that it was very laborious to compute C, D, E, etc. by this method. 
To see in modern terms what was involved, write the coefficient of $x^k$ in $[m]$ as $\binom{m}{k}$,
that is,


\[ \binom{m}{k} = \frac{m(m - 1) \cdots (m - k + 1)}{k!}. \tag{4.17} \]


50 Eu. I-10, chapter 5. E 212.
51 See Eu. I-15 p. 208, footnote 2, for a reference to Aepinus. 
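Thus, the coefficients are $A = \binom{m+n}{1}$, $B = \binom{m+n}{2}$. So we expect, and Euler
had to show, that the coefficient of $x^k$ in $[m] \cdot [n]$, that is,


\[ \binom{m}{0}\binom{n}{k} + \binom{m}{1}\binom{n}{k-1} + \cdots + \binom{m}{k-1}\binom{n}{1} + \binom{m}{k}\binom{n}{0}, \tag{4.18} \]


is equal to


\[ \binom{m+n}{k} = \frac{(m + n)(m + n - 1) \cdots (m + n - k + 1)}{k!}. \tag{4.19} \]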


He noted that it was safe to conclude that his method showed that the expressions
for the coefficients $A$, $B$, $C$, etc., depended upon $m$ and $n$ but did not require $m$ and $n$
to be integers. He then observed that when $m$ was an integer,


\[ [m] = \binom{m}{0} + \binom{m}{1}x + \binom{m}{2}x^2 + \cdots + \binom{m}{m}x^m = (1 + x)^m. \]


Thus, when $m$ and $n$ were positive integers,

\[ [m] \cdot [n] = (1 + x)^m \cdot (1 + x)^n = (1 + x)^{m+n} = [m + n], \tag{4.20} \]


so that the coefficient of $x^k$ was given by (4.19). Hence, Euler argued, (4.19) gave the
expression for the coefficient of $x^k$ for any real $m$ and $n$. He was in effect arguing that
if the equation


\[ \binom{m}{0}\binom{n}{k} + \binom{m}{1}\binom{n}{k-1} + \cdots + \binom{m}{k}\binom{n}{0} = \binom{m+n}{k}, \tag{4.21} \]


where $\binom{m}{k}$ was defined by (4.17), held true for all positive integers $m$ and $n$, then
it must also hold for any pair of real numbers $m$ and $n$. Though his argument was
sufficiently persuasive to his contemporaries, including Legendre, and although he 
applied the same argument in other situations, we observe that it is quite incomplete, 
an early example of Peacock’s principle of permanence of equivalent forms. We return 
to this topic later in this section. To continue with Euler’s proof of the binomial 
theorem, we suppose with Euler that he had proved that 


[m]-[n] = [m +n] (4.22) 


was true for all real m and n. Euler could then deduce the binomial theorem for rational 
exponents. He supposed $m = \frac{p}{q}$, where $p$ and $q$ were positive integers. Then by (4.20)


\[ (1 + x)^p = [p] = \left[\frac{p}{q} + \frac{p}{q} + \cdots + \frac{p}{q}\right] = \left[\frac{p}{q}\right]^q, \]

or


\[ \left[\frac{p}{q}\right] = (1 + x)^{p/q}, \tag{4.23} \]


proving the theorem for positive rational exponents. Euler extended this result to 
negative rational exponents by noting that 


\[ [m] \cdot [-m] = [m - m] = [0] = 1 \]


and therefore


\[ \left[\frac{p}{q}\right] \cdot \left[-\frac{p}{q}\right] = 1. \]


By (4.23), this meant that $(1 + x)^{p/q}\left[-\frac{p}{q}\right] = 1$ and thus $\left[-\frac{p}{q}\right] = (1 + x)^{-p/q}$. Euler
did not discuss convergence questions here. He certainly knew that the series for $[m]$
converged when $|x| < 1$, but apparently he had not given thought to the more subtle
questions related to the convergence of the products of infinite series. Cauchy, Abel,
and Dirichlet eventually addressed such issues.

The gap in Euler’s reasoning, a full proof for (4.21), was addressed by Cauchy
in his 1821 Analyse Algébrique.⁵² In section 4.1 of this work, he first proved that
if two polynomials of one variable and of degree $n - 1$ were equal at $n$ different
values, then they were identical. This proof is contained in our Section 2.10. In the next
section of his book, Cauchy extended this result to polynomials in many variables. For
polynomials in two variables $x$ and $y$, the theorem can be stated: If $P(x, y)$ and $Q(x, y)$
are two polynomials of degree $n - 1$ in $x$ and $y$ that become equal when $x$ takes one
of the values $x_0, x_1, \ldots, x_{n-1}$, and $y$ takes one of the values $y_0, y_1, \ldots, y_{n-1}$, then the
two polynomials are identical. Cauchy proved this result by arguing that $P(x_0, y)$ and
$Q(x_0, y)$ were two polynomials of degree $n - 1$ in one variable, $y$. He reasoned that
since these polynomials were equal for $n$ values, $y = y_0, y_1, \ldots, y_{n-1}$, they were then
identical. Similarly, for $i = 1, 2, \ldots, n - 1$,


\[ P(x_i, y) = Q(x_i, y). \]


Now for any real value of $y$, $P(x, y) = Q(x, y)$, for $x = x_0, x_1, \ldots, x_{n-1}$. Thus,
Cauchy could conclude that $P(x, y) = Q(x, y)$.

Cauchy went on to apply this theorem to prove (4.21), observing that both sides 
were polynomials of degree k in two variables m and n; furthermore, since these 
polynomials were equal for infinitely many integer values of m and n, they were 
identical. 
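The identity (4.21) for non-integer exponents can be spot-checked with exact rational arithmetic. The sketch below is ours: `binom` implements the definition (4.17) for a rational upper argument, and `convolution` forms the sum on the left of (4.21). The particular fractions $m = \frac{1}{2}$ and $n = -\frac{2}{3}$ are arbitrary, and a finite check of this kind illustrates the identity but of course does not replace Cauchy's polynomial argument.

```python
from fractions import Fraction

def binom(m, k):
    """m(m - 1)...(m - k + 1) / k! for a rational m and a nonnegative integer k."""
    result = Fraction(1)
    for j in range(k):
        result *= Fraction(m) - j
        result /= j + 1
    return result

def convolution(m, n, k):
    """Coefficient of x^k in the product [m][n]: sum of binom(m, j) * binom(n, k - j)."""
    return sum(binom(m, j) * binom(n, k - j) for j in range(k + 1))

m, n = Fraction(1, 2), Fraction(-2, 3)
for k in range(6):
    print(k, convolution(m, n, k), binom(m + n, k))
```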

As we discuss in Chapter 17, Euler once again saw no need to validate his move 
from integers to all real numbers in the case where the function was not a polynomial. 


52 Cauchy (1989).


4.4 Cauchy: Proof of the Binomial Theorem for Real Exponents


In his lectures at the École Polytechnique, Cauchy attempted to put the work of his
predecessors on a more solid foundation,⁵³ although his students did not appreciate
his efforts and apparently complained about them. In order to make Euler’s work of
the previous section more rigorous, Cauchy had to define a continuous function, and
he also needed to work out the definitions of convergence, absolute convergence, and
the product of infinite series.

Cauchy considered the infinite series


\[ u_0 + u_1 + u_2 + \cdots \]


by setting the $n$th partial sum


\[ s_n = u_0 + u_1 + u_2 + \cdots + u_{n-1}. \]


Cauchy then stated that if $s_n$ approached a fixed limit as $n$ increased indefinitely,
then the infinite series would converge; otherwise, it diverged. The limit of $s_n$ was said
to be $s$ if for any small number $\epsilon$, the value of $s_n$ fell between $s - \epsilon$ and $s + \epsilon$, for large
enough $n$.

He stated and proved the ratio test for convergence of a series and deduced that the 
binomial series for |x| < 1 converged. He also defined what is known as the Cauchy 
product of two series 


\[ u_0 + u_1 + u_2 + \cdots \quad \text{and} \quad v_0 + v_1 + v_2 + \cdots \]


as


\[ u_0v_0 + (u_0v_1 + u_1v_0) + (u_0v_2 + u_1v_1 + u_2v_0) + \cdots + (u_0v_n + u_1v_{n-1} + \cdots + u_nv_0) + \cdots. \]


Here note that if we define the degree of a term $u_iv_j$ to be $i + j$, then $u_0v_0$ would be
of degree 0, $u_0v_1 + u_1v_0$ would contain all the terms of degree 1, $u_0v_2 + u_1v_1 + u_2v_0$
would contain all the terms of degree 2, and so on.

Cauchy then proved that if the two series were absolutely convergent and converged 
to $s$ and $s'$, then the product series converged to $s'' = ss'$. For the case in which all $u$
and $v$ were positive, Cauchy observed that


\[ s_{m+1}s'_{m+1} < s''_n < s_ns'_n, \tag{4.24} \]


where $m = \frac{n-1}{2}$ for all $n$ odd and $m = \frac{n-2}{2}$ for $n$ even. To quickly check the validity
of these two inequalities in (4.24), take the case when $n$ is even and equal to $2m + 2$.
In that case, $s''_n$ contains every term of degree less than or equal to $2m + 1$, while
$s_{m+1}s'_{m+1}$ cannot contain any term of degree greater than $2m$. The same approach
applies for $n$ odd.
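The sandwich (4.24) can be observed on a concrete pair of positive series. The sketch below is ours and makes several assumptions: the two geometric series with ratios $\frac{1}{2}$ and $\frac{1}{3}$ are arbitrary examples, the helper names are hypothetical, and the inequalities are tested in the non-strict form $\le$ because for $n = 1$ all three quantities coincide. Exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def partial_sum(terms, n):
    """s_n: the sum of the first n terms."""
    return sum(terms[:n], Fraction(0))

def product_partial_sum(u, v, n):
    """s''_n: the sum of all products u_i v_j of degree i + j at most n - 1."""
    return sum((u[i] * v[d - i] for d in range(n) for i in range(d + 1)), Fraction(0))

N = 40
u = [Fraction(1, 2 ** k) for k in range(N)]   # the full series sums to 2
v = [Fraction(1, 3 ** k) for k in range(N)]   # the full series sums to 3/2

def bounds_hold(n):
    m = (n - 1) // 2 if n % 2 else (n - 2) // 2
    lower = partial_sum(u, m + 1) * partial_sum(v, m + 1)
    upper = partial_sum(u, n) * partial_sum(v, n)
    return lower <= product_partial_sum(u, v, n) <= upper

print(all(bounds_hold(n) for n in range(1, N + 1)))
print(float(product_partial_sum(u, v, N)))   # close to 2 * 3/2 = 3
```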


53 Cauchy (1989) § 4.3 and Note 6. 
The two inequalities (4.24) implied the theorem for series with positive terms,
taking the limit as $m \to \infty$. When the terms $u_0, u_1, u_2, \ldots$ and $v_0, v_1, v_2, \ldots$ were
positive as well as negative, Cauchy first observed that


\[ s_ns'_n - s''_n = u_{n-1}v_{n-1} + (u_{n-1}v_{n-2} + u_{n-2}v_{n-1}) + \cdots + (u_{n-1}v_1 + \cdots + u_1v_{n-1}). \]


He denoted the absolute values of the $u$ and $v$ by $P_0, P_1, P_2, \ldots$ and $P'_0, P'_1, P'_2, \ldots$
respectively and remarked that, from the result on the convergence of the product of
series with all positive terms,


\[ P_{n-1}P'_{n-1} + (P_{n-1}P'_{n-2} + P_{n-2}P'_{n-1}) + \cdots + (P_{n-1}P'_1 + \cdots + P_1P'_{n-1}) \]


tended to zero as $n \to \infty$. Since this expression was an upper bound for $|s_ns'_n - s''_n|$,
it followed that $s_ns'_n - s''_n \to 0$ as $n \to \infty$, proving the result.

With these theorems in hand, Cauchy could close the gaps in Euler’s proof of 
the binomial theorem. The binomial series [m] and [n] in (4.16) were absolutely 
convergent for $|x| < 1$. Hence by Euler’s argument and Cauchy’s result on products
of series, $[m] \cdot [n] = [m + n]$. Finally, if $[m]$ was a continuous function of $m$, and if
for integers $p$ and $q$, $\left[\frac{p}{q}\right] = (1 + x)^{p/q}$, then $[m] = (1 + x)^m$ for all real exponents $m$.
Recall that Cauchy’s proof of the continuity of [m] was inadequate, and the gap was 
filled by Abel. 

Cauchy gave another proof of the binomial theorem as a corollary of Taylor’s or 
Maclaurin’s theorem. It was well known in the eighteenth century that the binomial 
theorem could be formally derived from Maclaurin’s theorem, but Cauchy was the 
first to understand how an analysis of the remainder term could be applied to obtain a 
rigorous proof of the binomial theorem for real exponents. Note that Cauchy gave the 
remainder in two different forms, as discussed in our Chapter 11: 


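\[ f(x) = f(0) + \frac{x}{1}f'(0) + \frac{x^2}{1 \cdot 2}f''(0) + \cdots + \frac{x^{n-1}}{1 \cdot 2 \cdots (n - 1)}f^{(n-1)}(0) + R_n, \]

where

\[ R_n = \frac{x^n}{1 \cdot 2 \cdots n}f^{(n)}(\theta x), \quad 0 < \theta < 1, \]

or

\[ R_n = \frac{(1 - \theta_1)^{n-1}x^n}{1 \cdot 2 \cdots (n - 1)}f^{(n)}(\theta_1 x), \quad 0 < \theta_1 < 1. \]


When $f(x) = (1 + x)^\mu$, he had

\[ f^{(k)}(x) = \mu(\mu - 1) \cdots (\mu - k + 1)(1 + x)^{\mu - k} \]

and hence


\[ \frac{f^{(k)}(0)}{k!} = \frac{\mu(\mu - 1) \cdots (\mu - k + 1)}{k!}, \]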
so the binomial series was obtained. To determine the values of $x$ for which the series
equaled $(1 + x)^\mu$, it was necessary to find the values of $x$ for which $R_n \to 0$ as
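$n \to \infty$. Taking $m$ large enough that $\left|\frac{\mu + 1}{m + 1}\right| < 1$, Cauchy noted⁵⁴ that


\[ u_{n-1} = \frac{\mu(\mu - 1) \cdots (\mu - n + 2)}{1 \cdot 2 \cdots (n - 1)}x^{n-1} = (-1)^m\,\frac{\mu(\mu - 1) \cdots (\mu - m + 1)}{1 \cdot 2 \cdots m}\left(1 - \frac{\mu + 1}{m + 1}\right) \cdots \left(1 - \frac{\mu + 1}{n - 1}\right)(-x)^{n-1}, \]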


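and hence for $|x| < 1$, $u_{n-1} \to 0$ as $n \to \infty$. Now the first form of the remainder
would be


\[ R_n = \frac{\mu(\mu - 1) \cdots (\mu - n + 1)}{1 \cdot 2 \cdots n}x^n(1 + \theta x)^{\mu - n} = u_{n-1} \cdot \frac{\mu - n + 1}{n} \cdot x(1 + \theta x)^\mu\left(\frac{1}{1 + \theta x}\right)^n. \]


The factor $\left(\frac{1}{1 + \theta x}\right)^n$ was bounded only for positive $x$, and he could deduce that
$R_n \to 0$ as $n \to \infty$ for $0 < x < 1$. Cauchy needed the second form of the remainder
to be able to deduce the binomial theorem for $|x| < 1$. Using the second form, Cauchy
had


\[ R_n = u_{n-1}(\mu - n + 1)\,x(1 + \theta_1 x)^{\mu - 1}\left(\frac{1 - \theta_1}{1 + \theta_1 x}\right)^{n-1}. \]


Clearly, $\left(\frac{1 - \theta_1}{1 + \theta_1 x}\right)^{n-1}$ was bounded for $|x| < 1$ and $R_n \to 0$ as $n \to \infty$ for $|x| < 1$.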
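The behavior of the remainder for negative $x$, the case that required Cauchy's second form, can be observed numerically. In this sketch of ours, `binomial_partial_sum` adds the first $n$ terms of the binomial series for exponent $\mu$, so that the difference from $(1 + x)^\mu$ is the remainder $R_n$. The values $\mu = \frac{1}{2}$ and $x = -0.9$ are arbitrary illustrative choices, and floating-point arithmetic is assumed to be accurate enough for the comparison.

```python
def binomial_partial_sum(mu, x, n):
    """Sum of the terms in x^0, ..., x^(n-1) of the binomial series for (1 + x)^mu."""
    total, term = 0.0, 1.0
    for k in range(n):
        total += term
        term *= (mu - k) / (k + 1) * x   # pass from the term in x^k to the term in x^(k+1)
    return total

mu, x = 0.5, -0.9
exact = (1 + x) ** mu
for n in (10, 50, 200, 400):
    print(n, exact - binomial_partial_sum(mu, x, n))   # the remainder R_n
```

The printed remainders shrink toward zero as $n$ grows, in agreement with the conclusion for $|x| < 1$.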


4.5 Abel’s Theorem on Continuity 


Abel’s continuity theorem was a response to Cauchy’s 1821 result requiring uniform 
convergence, a concept developed later by Weierstrass. Implicitly accounting for 
uniform convergence in his result, Abel proved a theorem yielding the binomial 
theorem for real exponents. Though Abel found a mistake in Cauchy’s work, he 
acknowledged his indebtedness to Cauchy and wrote in his paper that every analyst 
who loved rigor in mathematics should study Cauchy’s Cours d’analyse; we note that 
this work is also called Analyse algébrique. Abel’s basic result on power series is 
now called Abel’s continuity theorem; in modern terms, it states that if an infinite
series $\sum a_n$ converges, then the series $\sum a_nx^n$ converges uniformly for $0 \le x \le 1$
and also tends to $\sum a_n$ as $x$ tends to $1^-$. In 1897, Alfred Tauber (1866-1942) proved
a conditional converse,⁵⁵ leading to the extensive Tauberian theory, developed and
named by Hardy and Littlewood.


54 Cauchy (1829) pp. 85-86, 102-105. 
55 Tauber (1897). 
In his January 1826 letter to Holmboe, Abel discussed his continuity theorem:


Let $a_0 + a_1 + a_2 + a_3 + a_4 +$ etc. be any infinite Series and thus You know that a very useful
Manner of adding up this Series is to seek the sum of the following: $a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots$
and then later, put $x = 1$ in the Results. This is correct; but it seems to me that one cannot accept
it without Proof.


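Abel applied his theorem⁵⁶ to show that if $A$ and $B$ were convergent infinite series
and their Cauchy product $C$ was convergent, then $AB = C$. Abel’s continuity theorem
was based on a lemma using summation by parts: If $t_0, t_1, \ldots, t_m, \ldots$ denoted a
sequence of arbitrary quantities, and if the quantity $p_m = t_0 + t_1 + \cdots + t_m$ was
less than a definite quantity $\delta$, then


\[ r = \epsilon_0t_0 + \epsilon_1t_1 + \cdots + \epsilon_mt_m < \delta\epsilon_0, \tag{4.25} \]

where $\epsilon_0, \epsilon_1, \epsilon_2, \ldots$ were positive decreasing quantities. To prove this result, Abel
noted that

\[ \begin{aligned} r &= \epsilon_0p_0 + \epsilon_1(p_1 - p_0) + \epsilon_2(p_2 - p_1) + \cdots + \epsilon_m(p_m - p_{m-1}) \\ &= p_0(\epsilon_0 - \epsilon_1) + p_1(\epsilon_1 - \epsilon_2) + \cdots + p_{m-1}(\epsilon_{m-1} - \epsilon_m) + p_m\epsilon_m \\ &< \delta(\epsilon_0 - \epsilon_1 + \epsilon_1 - \epsilon_2 + \cdots + \epsilon_{m-1} - \epsilon_m + \epsilon_m) = \delta\epsilon_0. \end{aligned} \tag{4.26} \]

The last step in this proof was valid because $\epsilon_0 - \epsilon_1, \epsilon_1 - \epsilon_2, \ldots$ were positive.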
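The rearrangement (4.26) is a finite identity, so it can be checked on any sample data. The sketch below is ours: `abel_summation` returns the sum $r$ computed directly, the same sum computed after summation by parts, and the bound obtained from the largest partial sum $p_k$. The integer sequences are arbitrary, with the $\epsilon_k$ chosen positive and decreasing as the lemma requires, and the bound is tested in the non-strict form.

```python
def abel_summation(t, eps):
    """Return (r computed directly, r computed by parts, max partial sum times eps_0)."""
    partial, running = [], 0
    for value in t:
        running += value
        partial.append(running)          # p_k = t_0 + ... + t_k
    m = len(t) - 1
    direct = sum(e * value for e, value in zip(eps, t))
    by_parts = sum(partial[k] * (eps[k] - eps[k + 1]) for k in range(m)) + partial[m] * eps[m]
    return direct, by_parts, max(partial) * eps[0]

t = [3, -5, 4, -1, 2, -6, 5]      # arbitrary quantities with bounded partial sums
eps = [9, 7, 6, 4, 3, 2, 1]       # positive and decreasing
direct, by_parts, bound = abel_summation(t, eps)
print(direct, by_parts, bound)    # 11 11 27
```

Although the individual products $\epsilon_kt_k$ are as large as 35 in absolute value, the sum is controlled by the partial sums of the $t_k$ alone, which is what makes the lemma useful for conditionally convergent series.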


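Next, for the continuity theorem, Abel wrote that if the series


\[ f(\alpha) = v_0 + v_1\alpha + v_2\alpha^2 + \cdots + v_m\alpha^m + \cdots \]


converged for $\alpha = \delta$, it would also converge for every smaller value of $\alpha$; likewise,
$f(\alpha - \beta)$, for continually decreasing values of $\beta$, would come arbitrarily close to the
limit $f(\alpha)$, with $\alpha$ equal to or smaller than $\delta$. To prove this, Abel let


\[ v_0 + v_1\alpha + \cdots + v_{m-1}\alpha^{m-1} = \phi(\alpha), \]
\[ v_m\alpha^m + v_{m+1}\alpha^{m+1} + \cdots = \psi(\alpha). \]


Then


\[ \psi(\alpha) = \left(\frac{\alpha}{\delta}\right)^m v_m\delta^m + \left(\frac{\alpha}{\delta}\right)^{m+1} v_{m+1}\delta^{m+1} + \cdots < \left(\frac{\alpha}{\delta}\right)^m p, \]


where $p$ was the maximum of


\[ v_m\delta^m, \quad v_m\delta^m + v_{m+1}\delta^{m+1}, \quad v_m\delta^m + v_{m+1}\delta^{m+1} + v_{m+2}\delta^{m+2}, \quad \ldots. \]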


56 Abel (2007) pp. 107-112. 
Note that the inequality followed from his lemma proving (4.25). Then for $0 < \alpha \le \delta$,
$m$ could be chosen large enough so that $\psi(\alpha) = \omega$. We observe that Abel used the
symbol $\omega$ to denote an arbitrarily small quantity. Next,


\[ f(\alpha) = \phi(\alpha) + \psi(\alpha), \]


and hence


\[ f(\alpha) - f(\alpha - \beta) = \phi(\alpha) - \phi(\alpha - \beta) + \omega. \]


Since $\phi(\alpha)$ was a polynomial, $\beta$ could be taken small enough that $\phi(\alpha) - \phi(\alpha - \beta) = \omega$
and hence $f(\alpha) - f(\alpha - \beta) = \omega$, proving the theorem.
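To address the defect in Cauchy’s proof of the binomial theorem, Abel stated and
proved the theorem: Let $v_0 + v_1\delta + v_2\delta^2 + \cdots$ be a convergent series, in which
$v_0, v_1, v_2, \ldots$ are continuous functions of a variable quantity $x$ between $x = a$ and
$x = b$; then the series $f(x) = v_0 + v_1\alpha + v_2\alpha^2 + \cdots$, where $\alpha < \delta$, will be convergent
and a continuous function of $x$ between the same limits. As in the proof of the previous
theorem, Abel set


\[ v_0 + v_1\alpha + \cdots + v_{m-1}\alpha^{m-1} = \phi(x) \quad \text{and} \quad v_m\alpha^m + v_{m+1}\alpha^{m+1} + \cdots = \psi(x). \]

Then

\[ \psi(x) = \left(\frac{\alpha}{\delta}\right)^m v_m\delta^m + \left(\frac{\alpha}{\delta}\right)^{m+1} v_{m+1}\delta^{m+1} + \left(\frac{\alpha}{\delta}\right)^{m+2} v_{m+2}\delta^{m+2} + \cdots. \]

By the summation by parts lemma, if $\theta(x)$ denoted the largest of the quantities

\[ v_m\delta^m, \quad v_m\delta^m + v_{m+1}\delta^{m+1}, \quad v_m\delta^m + v_{m+1}\delta^{m+1} + v_{m+2}\delta^{m+2}, \quad \ldots, \]


then $\psi(x) < \left(\frac{\alpha}{\delta}\right)^m \theta(x)$. Thus, for $m$ large enough, $\psi(x) = \omega$ and $f(x) = \phi(x) + \omega$,
where $\omega$ was less than any assignable quantity. Similarly,


\[ f(x) - f(x - \beta) = \phi(x) - \phi(x - \beta) + \omega. \]


Since $\phi(x)$ was a finite sum of continuous functions, it was also continuous and hence
$\phi(x) - \phi(x - \beta) = \omega$. Therefore, $f(x) - f(x - \beta) = \omega$, which meant that $f(x)$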
was continuous. It was here that Abel pointed out in a footnote that Cauchy’s theorem 
on an infinite sum of continuous functions had some exceptions. But Abel succeeded 
in filling the gap, so that the proof of the binomial theorem for real exponents was 
complete. 

Abel went on⁵⁷ to prove the binomial theorem for a complex variable $x$ and
complex exponent. He finally stated his result as 


57 ibid. p. 125. 




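\[ \begin{aligned} 1 &+ \frac{m + ni}{1}(a + bi) + \frac{(m + ni)(m - 1 + ni)}{1 \cdot 2}(a + bi)^2 \\ &+ \frac{(m + ni)(m - 1 + ni)(m - 2 + ni)}{1 \cdot 2 \cdot 3}(a + bi)^3 + \cdots \\ &+ \frac{(m + ni)(m - 1 + ni) \cdots (m - \mu + 1 + ni)}{1 \cdot 2 \cdots \mu}(a + bi)^\mu + \cdots \\ = {} & \left[\cos\left(m\arctan\frac{b}{1 + a} + \frac{1}{2}n\ln\left[(1 + a)^2 + b^2\right]\right) + i\sin\left(m\arctan\frac{b}{1 + a} + \frac{1}{2}n\ln\left[(1 + a)^2 + b^2\right]\right)\right] \\ &\times \left[(1 + a)^2 + b^2\right]^{m/2} e^{-n\arctan\frac{b}{1 + a}}. \end{aligned} \]


Note that Abel wrote $\sqrt{-1}$ for $i$ and log for ln; the right-hand side was the principal
value of $(1 + a + bi)^{m + ni}$.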

Liouville found Abel’s proof of the continuity theorem difficult to understand and 
asked Dirichlet to explain it. Dirichlet presented a proof on the spot; Liouville then 
used it in his lectures at the Collége de France. After Dirichlet’s death, Liouville 
published the proof in honor of his friend.⁵⁸ He stated and proved the theorem:

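If the series $a_0 + a_1 + a_2 + \cdots$ converges to $A$, then


\[ \lim_{p \to 1^-} \sum_{n=0}^{\infty} a_np^n = A. \]


Let $\delta_n = a_0 + a_1 + \cdots + a_n$ and $0 < p < 1$. Then


\[ \begin{aligned} S &= a_0 + a_1p + a_2p^2 + \cdots + a_np^n + \cdots \\ &= \delta_0 + (\delta_1 - \delta_0)p + (\delta_2 - \delta_1)p^2 + \cdots + (\delta_n - \delta_{n-1})p^n + \cdots \\ &= (1 - p)(\delta_0 + \delta_1p + \delta_2p^2 + \cdots + \delta_np^n + \cdots). \end{aligned} \tag{4.27} \]


Note that the last equation, (4.27), is valid because the first $n + 1$ terms of the two
series differ by $\delta_np^{n+1}$, and this tends to zero as $n \to \infty$. Next, break up the last series
into two parts:


\[ S(p) = (1 - p)(\delta_0 + \delta_1p + \cdots + \delta_{n-1}p^{n-1}) + (1 - p)(\delta_np^n + \delta_{n+1}p^{n+1} + \cdots). \]

Let $P_n$ be a number between the maximum and minimum values of the sequence


\[ \delta_n, \delta_{n+1}, \delta_{n+2}, \ldots \]


such that the second series is equal to


\[ (1 - p)P_n(p^n + p^{n+1} + p^{n+2} + \cdots) = P_np^n. \]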


58 Dirichlet (1863). 




Clearly, $P_n \to A$ as $n \to \infty$. So if we let $\rho \to 1$ and then let $n \to \infty$, the finite
series tends to zero and the other series tends to $A$ and the theorem is proved.
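The statement of the continuity theorem can be illustrated numerically. The sketch below is an assumption-laden example rather than part of Dirichlet's argument: it uses the series 1 - 1/2 + 1/3 - ..., which converges to ln 2, and the helper names are made up for the illustration.

```python
import math

def power_series_value(coeff, rho, terms=50_000):
    """Partial sum of sum_{n>=0} coeff(n) * rho**n for 0 < rho < 1."""
    total = 0.0
    power = 1.0
    for n in range(terms):
        total += coeff(n) * power
        power *= rho
    return total

def alternating_harmonic(n):
    # a_n = (-1)^n / (n + 1); the series a_0 + a_1 + ... converges to A = ln 2
    return (-1) ** n / (n + 1)

for rho in (0.9, 0.99, 0.999):
    print(rho, power_series_value(alternating_harmonic, rho))
print("A =", math.log(2))
```

As rho approaches 1 from below, the printed values approach A, in line with the theorem.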


4.6 Harkness and Morley’s Proof of the Binomial Theorem 


Weierstrass promulgated his fundamental ideas primarily through his teaching. Thus, 
it was left to others to write up and disseminate these ideas. For example, in 1898, 
J. Harkness of Bryn Mawr and F. Morley of Haverford College wrote Introduction 
to the Theory of Analytic Functions. They explained:⁵⁹ “we recognized that readers
approaching the subject for the first time could not fail to be hampered by the 
non-existence in English of any text-book giving a consecutive and elementary 
account of the fundamental concepts and processes employed in the theory of func- 
tions.” In his delightful article, A Mathematical Education, the great English analyst 
J. E. Littlewood mentioned that his study of Harkness and Morley’s book was one of 
the brighter spots in his education up to the time he took his Tripos examination in 
1905. 

Harkness and Morley’s proof of the binomial theorem is different from the other 
proofs presented here, and it considers the general case where the variable and 
exponent are both complex numbers. The proof depends on a theorem attributed to 
Weierstrass:⁶¹ Let $u_q$, $q = 0, 1, 2, \ldots$, be series in powers of $x$:

$$u_q = a_{q0} + a_{q1}x + a_{q2}x^2 + \cdots + a_{qn}x^n + \cdots.$$

Given that the separate series $u_q$ and the collective series $\sum u_q$ converge within the
circle $(R)$ and that the series $\sum u_q$ converges uniformly along every circle $(R_1)$ where
$R_1 < R$, then within the circle $(R)$ we have

$$\sum_{q=0}^{\infty} u_q = \sum_{n=0}^{\infty} A_n x^n,$$

where $A_n$ is the sum of the coefficients of $x^n$ in the series of $u$s. Now consider the
function $(1 - a)^{-x} = \exp(-x \log(1 - a))$, where $a$ and $x$ are complex with $|a| < 1$
and where log takes its principal value.⁶² Then

$$(1 - a)^{-x} = 1 + u + \frac{u^2}{2!} + \frac{u^3}{3!} + \cdots,$$

where

$$u = -x\log(1 - a) = xa + \frac{xa^2}{2} + \frac{xa^3}{3} + \cdots.$$


59 Harkness and Morley (1898) p. v. 

60 Littlewood (1986) pp. 80-93, especially p. 82. 
61 ibid. p. 134. 

62 ibid. p. 169. 
It is clear that the series in $u$ is absolutely and uniformly convergent in every circle
$|u| < R$, while the series for $u$ in powers of $a$ is absolutely and uniformly convergent
in $|a| < 1 - \delta$ for every $\delta > 0$. Therefore, by Weierstrass’s theorem,

$$(1 - a)^{-x} = \sum_{n=0}^{\infty} \frac{X_n}{n!}\,a^n$$

for $|a| < 1$. It remains to find $X_n$. Note that only $u, u^2, \ldots, u^n$ contribute to the
expression. So

$$X_n = x^n + \cdots + (n-1)!\,x$$

is a polynomial of degree $n$ in $x$. When $x = 0, -1, -2, \ldots, -n+1$, we have
$(1 - a)^{-x} = (1 - a)^m$ where $m = 0, 1, 2, \ldots, n - 1$. For these values of $m$, the
coefficient of $a^n$ in $(1 - a)^m$ is zero. Hence, $X_n = 0$ for $x = 0, -1, -2, \ldots, -n+1$,
and since the leading coefficient of $X_n$ is 1, we have

$$X_n = x(x + 1)(x + 2)\cdots(x + n - 1)$$

and thus

$$(1 - a)^{-x} = \sum_{n=0}^{\infty} \frac{x(x+1)\cdots(x+n-1)}{n!}\,a^n.$$
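The key step, that n! times the coefficient of a^n in exp(-x log(1 - a)) is the polynomial x(x + 1)...(x + n - 1), can be confirmed by exact arithmetic for small n. This is a sketch under stated assumptions: the helper names are hypothetical, power series are truncated lists of rational coefficients, and only n >= 1 is handled.

```python
from fractions import Fraction
from math import factorial

def series_mul(p, q, degree):
    """Product of two power series in a, truncated after a**degree."""
    out = [Fraction(0)] * (degree + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j > degree:
                break
            out[i + j] += pi * qj
    return out

def x_poly_from_exponential(n):
    """Coefficients in x of n! * [a^n] exp(x * L(a)), L(a) = a + a^2/2 + a^3/3 + ..."""
    L = [Fraction(0)] + [Fraction(1, j) for j in range(1, n + 1)]
    coeffs = [Fraction(0)] * (n + 1)
    L_power = [Fraction(1)] + [Fraction(0)] * n       # L^0
    for k in range(1, n + 1):
        L_power = series_mul(L_power, L, n)           # now L^k
        coeffs[k] = factorial(n) * L_power[n] / factorial(k)
    return coeffs

def rising_factorial_poly(n):
    """Coefficients in x of x(x + 1)...(x + n - 1), ascending powers."""
    poly = [Fraction(1)]
    for j in range(n):
        new = [Fraction(0)] * (len(poly) + 1)
        for i, c in enumerate(poly):
            new[i] += j * c          # contribution of the constant j
            new[i + 1] += c          # contribution of x
        poly = new
    return poly

for n in range(1, 7):
    print(n, x_poly_from_exponential(n) == rising_factorial_poly(n))
```

The coefficient of x in each polynomial is (n - 1)!, matching the lowest term displayed in the text.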


4.7 Exercises


(1) Following Newton, apply the procedure for finding the square root of a
number to the algebraic expression $1 - x^2$ and show that you get the series

$$1 - \frac{1}{2}x^2 - \frac{1}{8}x^4 - \frac{1}{16}x^6 - \cdots.$$

(2) Apply the Gregory–Newton difference formula to the function $f(\alpha) = (1 + x)^{\alpha}$
and show that you get the binomial series.

(3) Prove that the Cauchy product of the series $\sum_{n=0}^{\infty} \frac{(-1)^n}{\sqrt{n+1}}$ with itself diverges.
Cauchy gave this example in his Analyse algébrique, chapter 6.

(4) Cauchy stated the ratio test in chapter 6 of his Analyse algébrique: If for n
increasing and positive, the ratio $\frac{u_{n+1}}{u_n}$ converges to a fixed limit $k$, then the
series $u_0 + u_1 + u_2 + \cdots$ converges when $k < 1$ and diverges when $k > 1$.
Prove this theorem.

(5) Cauchy’s Analyse algébrique, chapter 6, also gave the condensation test: A
series of positive and decreasing terms $u_0 + u_1 + u_2 + \cdots$ converges if and
only if $u_0 + 2u_1 + 4u_3 + 8u_7 + \cdots$ converges. Prove this theorem, and use it
to prove that $\sum_{n=1}^{\infty} \frac{1}{n^{\mu}}$ converges for $\mu > 1$ and diverges for $\mu \le 1$. Cauchy
gave this application. Earlier, the fourteenth-century French theologian and
scientific thinker N. Oresme used the condensation test in the case $\mu = 1$.
(6) Prove Abel’s theorem on products, that if $c_n = a_0b_n + a_1b_{n-1} + \cdots + a_nb_0$,
and $A = \sum_{n=0}^{\infty} a_n$, $B = \sum_{n=0}^{\infty} b_n$, $C = \sum_{n=0}^{\infty} c_n$ are all convergent, then
$AB = C$. See Abel (1965) vol. 1, p. 226.

(7) Prove F. Mertens’s extension of Cauchy’s product theorem: If A is absolutely
convergent and B conditionally convergent, then $AB = C$. See
Mertens (1875).

(8) Prove that if A and B are convergent, and $|na_n| < k$, $|nb_n| < k$ for all n, then
C is convergent. This theorem is due to G. H. Hardy; see Hardy (1966-1979)
vol. 6, pp. 414-416.

(9) Follow Abel in proving that $\sum_{n=2}^{\infty} \frac{1}{n \ln n}$ diverges. Use $\ln(1 + x) < x$ to show
that $\ln\ln(1 + n) < \ln\ln n + \frac{1}{n \ln n}$. Conclude that
$\ln\ln(1 + n) < \ln\ln 2 + \frac{1}{2\ln 2} + \cdots + \frac{1}{n \ln n}$. See Abel (1965) vol. 1, pp. 399-400.

(10) Show that if $A_n = a_1 + a_2 + \cdots + a_n$, $|A_n|$ is bounded for all n, and
$\sum_{k=1}^{n} |b_{k+1} - b_k|$ is bounded for all n, and $b_n \to 0$ as $n \to \infty$, then
$\sum_{n=1}^{\infty} a_nb_n$ converges. Dedekind stated this theorem in Supplement IX to
Dirichlet’s lectures on number theory. See Dirichlet and Dedekind (1999)
pp. 261-264.

(11) (a) Suppose $\phi$ is a real valued continuous function and $\phi(x + y) = \phi(x) + \phi(y)$.
Show that $\phi(x) = ax$ for some constant $a$.

(b) Suppose $\phi$ is a real valued continuous function and $\phi(x + y) = \phi(x)\phi(y)$.
Show that $\phi(x) = A^x$ for some positive constant $A$.

(c) Suppose $\phi$ is a real valued continuous function and for $x > 0$, $y > 0$,
$\phi(xy) = \phi(x) + \phi(y)$. Show that $\phi(x) = a \ln x$.

(d) Suppose $\phi$ is a real valued continuous function and for $x > 0$, $y > 0$,
$\phi(xy) = \phi(x)\phi(y)$. Show that $\phi(x) = x^a$ for $x > 0$.

(e) Suppose $\phi$ is a real valued continuous function and $\phi(y + x) + \phi(y - x) = 2\phi(x)\phi(y)$.
Show that if $0 < \phi(x) < 1$ and $\phi$ is not constant,
then $\phi(x) = \cos(ax)$ where $a$ is a constant. If $\phi(x) > 1$, then there
exists a positive constant $A$ such that $\phi(x) = \frac{1}{2}(A^x + A^{-x})$. The solutions of these
five problems take up all of chapter 5 of Cauchy’s Analyse algébrique of
1821.


(12) Let

$$\phi(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots.$$

Show that $\phi(x + y) = \phi(x) \cdot \phi(y)$ and hence that $\phi(x) = (\phi(1))^x = e^x$. See
Cauchy (1989) pp. 168-169.


(13) Prove the binomial theorem: Let

$$f_{\alpha}(x) = \sum_{k=0}^{\infty} \frac{\alpha(\alpha+1)\cdots(\alpha+k-1)}{k!}\,x^k.$$

Show that $\frac{d}{dx} f_{\alpha}(x) = \alpha f_{\alpha+1}(x)$ and $f_{\alpha+1}(x) - f_{\alpha}(x) = x f_{\alpha+1}(x)$. Deduce
that $f_{\alpha}(x) = (1 - x)^{-\alpha}$. Gauss did not explicitly give this proof, but the
first two steps are very special cases of results in his paper on hypergeometric
functions.

(14) With the development of set theory and the language of sets by Cantor,
Dedekind and others in the period 1870-1900, mathematicians could ask
whether the equation $\phi(x + y) = \phi(x) + \phi(y)$ had solutions other than
$\phi(x) = ax$, $a$ a constant, and, if so, whether a condition weaker than
continuity would imply $\phi(x) = ax$. Hilbert’s student Georg Hamel (1877–1954)
used Zermelo’s result, that the set of real numbers can be well-ordered,
to obtain a basis B for the vector space of real numbers over the field of
rational numbers. Thus, B has the property that for every real number x,
$x = r_1\alpha_1 + r_2\alpha_2 + \cdots + r_n\alpha_n$ for some n, rationals $r_1, \ldots, r_n$ and basis
elements $\alpha_1, \ldots, \alpha_n$. Use a Hamel basis to show that $\phi(x + y) = \phi(x) + \phi(y)$
has solutions different from $\phi(x) = ax$, where $a$ is a constant. See G.
Hamel (1905).


(15) Prove that if $\phi$ is measurable and satisfies $\phi(x + y) = \phi(x) + \phi(y)$, then
$\phi(x) = ax$. This was proved by M. Fréchet in 1913. For a simple proof, see
Kac (1979) pp. 64-65.


4.8 Notes on the Literature 


In 1712, Newton cited his two letters to Leibniz and the letters of James Gregory to 
Collins to establish his priority over Leibniz in the invention of the calculus. Whiteside 
discusses this at length in volume 8 of Newton’s works, pp. 469-632. In the context of 
the calculus controversy, it was argued by Newton that Gregory did his work on series 
after seeing Newton’s series on the area of a zone of a circle. Consequently, Gregory’s 
work was perhaps relegated to a secondary position, though his discoveries were 
independent. The 1939 publication of the Gregory memorial volume has helped to 
reestablish Gregory’s position as one of the greatest mathematicians of the seventeenth 
century. 

Landen’s 43-page booklet was a contribution to the mathematical tendency of that 
time, to employ algebra to avoid infinitesimals and fluxions. Also part of this tradition, 
Hutton (1812) discusses the binomial theorem, expressing appreciation for Landen’s 
proof. Hutton’s three-volume work is entertaining reading, with articles on building 
bridges, experiments with gunpowder, histories of trigonometric and logarithmic 
tables, and a long history of algebra. 

Cauchy started teaching analysis at the École Polytechnique in 1817. He divided his
course into two parts, the first dealing with infinite series of real and complex variables 
and the second with differential and integral calculus. Following eighteenth-century 
usage, he called the first part algebraic analysis. These lectures were published in 
1821 with the title Analyse algébrique. Bradley and Sandifer (2009) present an English 
translation with useful notes. This was the first textbook dealing fairly rigorously with 
the basic concepts of infinite series: limits, convergence, and continuity. 

Abel’s paper on the binomial theorem appeared in the first issue of A. L. Crelle’s
Journal für die reine und angewandte Mathematik in 1826. Crelle’s Journal is the oldest
mathematical journal still being published today. A large majority of Abel’s papers
were published in this journal, helping it quickly secure a high standing
in the mathematics community. There are two editions of Abel’s collected papers. 
Abel’s college teacher B. M. Holmboe (1795-1850) published the first in 1837. 
Unfortunately, Holmboe could not include Abel’s great paper, “Mémoire sur une
propriété générale d’une classe très-étendue de fonctions transcendantes,” presented
by Abel to the French Academy in 1826. This manuscript was lost and found several
times before it was finally published in Paris in 1841. The manuscript was again
lost, possibly stolen by G. Libri; in 1952 a portion was recovered in Florence by
the Norwegian mathematician V. Brun. In 2000, Andrea Del Centina discovered the
remaining parts, with the exception of four pages. Abel’s “Mémoire” was included in
the second and larger edition of his collected work, edited by L. Sylow and S. Lie, 
published in 1881. Abel (1965) is a reprint of this edition. For an English translation 
of Abel’s papers on analysis, see Abel (2007); see the footnote on p. 111 concerning 
Abel’s example of a series of continuous functions converging to a discontinuous 
function. 


5


The Rectification of Curves 


5.1 Preliminary Remarks 


Up until the seventeenth century, geometry was pursued along the Greek model. Thus, 
second-order algebraic curves were studied as conic sections, though higher-order 
curves were also considered. Algebraic relationships among geometric quantities were 
considered, but algebraic equations were not used to describe geometric objects. In the 
course of his attempts during the late 1620s to recreate the lost work of Apollonius, it 
occurred to Fermat that geometry could be studied analytically by expressing curves in 
terms of algebraic equations. Conic sections are defined by second-degree equations 
in two variables, but this new perspective expanded geometry to include curves of 
any degree. Fermat’s work in algebraic geometry was not published in his lifetime, 
so its influence was not great. But during the 1620s, René Descartes (1596-1650) 
developed his conception of algebraic geometry and his seminal work, La Géométrie, 
was published in 1637. Note that Fermat’s work on algebraic geometry, Ad locos 
planos et solidos isagoge, written in Viète’s notation, was published in 1679, though he
most probably wrote it before Descartes’s 1637 book.¹ The variety of new curves made
possible by this new perspective, combined with the development of the differential 
method, spurred the efforts to discover a general method for determining the length 
of an arc. In the late 1650s, Hendrik van Heuraet (1634—c. 1660) and William Neil(e) 
(1637-1670) gave a solution to this problem by reducing it to the problem of finding 
the area under a related curve. In this and other areas, Descartes’s new approach to 
geometry served as a guiding backdrop. 

Descartes’s early training in mathematics included the study of the classical texts of 
the fourth-century Greek mathematician Pappus, the Arithmetica of Diophantus, and 
the contemporary algebra of Peter Roth and Christoph Clavius. Descartes’s meeting 
with Johann Faulhaber in the winter 1619-20 also contributed to his understanding 
of algebra. However, from the very beginning, Descartes was determined to develop 
and follow his own methods, and this eventually led him to a symbolic algebra whose 
notation was very similar to the one we now use. 


1 Struik (1987) p. 99.
In the first part of his Géométrie, Descartes explained how geometric curves could
be reduced to polynomial equations in two variables. He did not consistently use 
what are now called the Cartesian orthogonal axes but chose the angle of his axes 
to suit the problem. On the subject of the rectification of curves, Descartes wrote, 
“geometry should not include lines that are like strings, in that they are sometimes 
straight and sometimes curved, since the ratios between straight and curved lines 
are not known, and I believe cannot be discovered by human minds.”² But in a
November 1638 letter to Mersenne, Descartes nevertheless discussed the rectification 
of the logarithmic spiral, a curve he defined as making a constant angle with the 
radius vector at each point. To understand this apparent contradiction in his thinking, 
note that Descartes made a distinction between geometric and mechanical curves. 
Geometric curves were defined by algebraic equations; mechanical curves would 
today be termed transcendental. So when Descartes referred to unrectifiable curves, 
he meant the geometric ones. He maintained that the study of geometry should be 
restricted to algebraic curves, for which algebraic methods should be used. Thus, for 
constructing tangents and normals to curves, he used algebraic methods, as opposed 
to the infinitesimal methods of Fermat. Descartes considered Fermat’s methods 
inappropriate for geometric (for us, algebraic) curves. And, since the length of an 
arc could not be found by algebraic methods, he stated that the length of geometric 
curves could not be obtained.³ He allowed, however, that the lengths of mechanical
(transcendental) curves could be determined by infinitesimal methods. In fact, in 1638 
Descartes succeeded in rectifying the equiangular (or logarithmic) spiral,⁴ and he was
therefore one of the earliest mathematicians to find the length of a noncircular arc. 
Soon after this, Torricelli also rectified this spiral. We note that by approximately 
1614, Harriot had completed his work on this spiral, in connection with his researches 
related to navigation, but he did not publish his results.⁵

Frans van Schooten (1615-1660) played an important role in the solution of the 
rectification problem. A Dutch mathematician of considerable ability, he was certainly 
one of the great teachers of mathematics. He attracted a number of talented young 
students to mathematics, even when their primary interests lay in other disciplines. Van 
Schooten studied at the University of Leiden where his father, Frans van Schooten the 
Elder, was a professor of mathematics. The younger van Schooten received a thorough 
grounding in the Dutch mathematical tradition. In 1635, he met Descartes, and by the 
summer of 1637 he had seen his Géométrie, though he did not immediately understand 
it. In 1646 he inherited his father’s professorship and in 1647, van Schooten published 
a Latin translation of Descartes’s book with his own commentary. This translation 
made Descartes’s ideas accessible to many more mathematicians and simultaneously 
helped build van Schooten’s reputation. A second edition of this translation appeared 
in 1659. The work was about a thousand pages long, ten times longer than Géométrie; 


2 Descartes, Smith, and Latham (1954) p. 91. 
3 Hofmann (1990) vol. 2, p. 132. 

4 Hofmann (1974) p. 103. 

5 Pepper (1968). 
it included several significant contributions by van Schooten’s students: Christiaan
Huygens, Jan Hudde, Jan de Witt, and Hendrik van Heuraet.⁶

Van Heuraet entered the University of Leiden as a medical student. He was inspired 
to study the rectification problem by Huygens’s 1657 discovery that the arc length 
of a parabola could be measured by the quadrature of an equilateral hyperbola. In 
modern terms, this means that the arc length of $y = x^2$ can be computed by the
integral $\int (1 + 4x^2)^{1/2}\,dx$. Sometime in 1658, van Heuraet solved the general problem;
he communicated his work to van Schooten in a letter dated January 13, 1659.⁷ In the
course of applying his method to the semicubical parabola $y^3 = ax^2$, he used a rule
of Hudde concerning multiple roots of polynomials.

Jan Hudde (1628-1704) studied law at Leiden around 1648 and later served as 
burgomaster of Amsterdam for 21 years. He stated his rule in the article De Maximis 
et Minimis, communicated to van Schooten in a letter of February 26, 1658.⁸ His
rule provided a method for determining maxima and minima of functions and a
simplification of Descartes’s method for finding the normal to an algebraic curve.
In unpublished work from 1656,⁹ Hudde used the logarithmic series in the form
$x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots$. Note that N. Mercator published this series in 1668¹⁰ and Newton’s
unpublished results on this topic date from 1665. 

Around the same time as van Heuraet, the English mathematician William Neil 
gave a method for rectifying the semicubical parabola; this method could also be 
generalized to other curves. Wallis included Neil’s work in his Tractatus duo of 
1659.¹¹ The methods of Neil and van Heuraet were lacking in rigor, but Pierre Fermat
very soon filled the gap. He showed in his Comparatio Curvarum Linearum of 1660 
that a monotonically increasing curve will have a length. James Gregory, apparently 
independently of Fermat, also found a rigorous proof of this fact, technically better 
than Fermat’s, using the same basic idea. Gregory’s proof appeared in his Geometriae 
Pars Universalis of 1668. The inspiration behind both Fermat and Gregory was 
Christopher Wren’s rectification of the cycloid in 1659. 


5.2 Descartes’s Method of Finding the Normal 


Descartes’s conception of geometry may be summarized by his own remarks from his 
Geometry, given just before he described his method of finding the normal:¹²


Finally, all other properties of curves depend only on the angles which these curves make with 
other lines. But the angle formed by two intersecting curves can be as easily measured as the 
angle between two straight lines, provided that a straight line can be drawn making right angles 
with one of these curves at its point of intersection with the other. This is my reason for believing 


6 See Bissel (1987) for more details on the contributions of these Dutch mathematicians. 
7 van Schooten (1659) vol. I, pp. 517-520. 

8 ibid. pp. 507-516. 

9 Hofmann (2003) p. 39. Leibniz (1920) p. 123. 
10 Mercator (1668). 
11 Wallis (1659) pp. 90-94.
12 Descartes, Smith, and Latham (1954) p. 95. 
Figure 5.1 Descartes’s construction of a normal.


that I shall have given here a sufficient introduction to the study of curves when I have given a 
general method of drawing a straight line making right angles with a curve at an arbitrarily chosen 
point upon it. And I dare say that this is not only the most useful and most general problem in 
geometry that I know, but even that I ever have desired to know. 


In Figure 5.1, as given by Descartes, suppose $y = f(x)$ is the curve ACQ and
C is a point on the curve. Also suppose that CF is the tangent at C that intersects
the x-axis at F. Let CP be the normal at C, meaning that FCP is a ninety-degree
angle, with P a point on the x-axis. With M the foot of the perpendicular from C to
the x-axis, FM and PM are called, respectively, the
subtangent and the subnormal at C. Observe that if the lengths of the subtangent and
subnormal are known, then it is easy to draw or construct the tangent and the normal
to the curve at C. Descartes gave a method for constructing the normal at a point on
a curve $y = f(x)$ where $(f(x))^2$ was a polynomial. He assumed that the center P of
the circle tangent to the curve at C had coordinates $(v, 0)$. The equation of the circle
of radius $s$ was thus

$$y^2 + (x - v)^2 = s^2. \tag{5.1}$$

In general, a circle would cut the curve at C and at another point. These two points
would coalesce into one point when the circle was tangent to the curve. So Descartes
noted that when the variable $y$ was eliminated from the two equations (5.1) and
$y^2 = (f(x))^2$, then the polynomial

$$(f(x))^2 + (x - v)^2 - s^2$$

would have a factor $(x - e)^2$ for some $e$, or (with $g(x)$ some polynomial)

$$(f(x))^2 + (x - v)^2 - s^2 = (x - e)^2 g(x).$$


By equating the coefficients of the powers of x, he was able to find the value of e in 
terms of v and thus the value of the subnormal v — e. As his words indicate, Descartes 
felt that this method solved the fundamental problem of geometry. 

Let CA in Figure 5.1 be an algebraic curve. Note that in the original book, Descartes
interchanged x and y. But we will suppose CP is normal to the curve at C and let
AM = x and CM = y. The problem is to find v = AP. Descartes took the x-axis
such that the center P of the required circle fell on the axis. The equation of the circle
was $(x - v)^2 + y^2 = s^2$, where CP = s was the radius. He used this equation to
eliminate y from the equation of the curve, obtaining an expression in x with a double
root when the circle was tangent to the curve. He could then write this expression such
that it had a factor $(x - e)^2$. Descartes then found the required result by equating the
coefficients of the powers of x. He explained his method by applying it to some examples such as the
ellipse $\frac{r}{q}x^2 - rx + y^2 = 0$, in which case he used the equation of the circle to eliminate
y and obtain the equation

$$x^2 + \frac{qr - 2qv}{q - r}\,x + \frac{q(v^2 - s^2)}{q - r} = 0.$$

He set the left-hand side as $(x - e)^2$ and equated coefficients of powers of x to find
the necessary result:

$$\frac{q(r - 2v)}{q - r} = -2e, \quad\text{or}\quad v = \frac{r}{2} + \frac{q - r}{q}\,e = \frac{r}{2} + \frac{q - r}{q}\,x.$$


Claude Rabuel’s 1730 commentary¹³ on Descartes’s Géométrie gave the example
of the parabola $y^2 = rx$. In this case,

$$x^2 + (r - 2v)x + v^2 - s^2 = 0$$

must have a double root. The resulting equations, when the left-hand side is set equal
to $(x - e)^2$, are

$$r - 2v = -2e, \qquad v^2 - s^2 = e^2.$$

So $v = \frac{r}{2} + e$, and this implies that $v = \frac{r}{2} + x$, since $x = e$.
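The parabola computation can be replayed with exact rational arithmetic. The sketch below is only an illustration with hypothetical function names: given r and the abscissa e of the point C on y^2 = rx, it solves the two coefficient equations for v and s^2 and then confirms that the quadratic in x really has a double root and that the circle passes through C.

```python
from fractions import Fraction

def descartes_normal_parabola(r, e):
    """For y^2 = r*x, return (v, s2) with x^2 + (r - 2v)x + (v^2 - s2) = (x - e)^2."""
    r, e = Fraction(r), Fraction(e)
    v = r / 2 + e            # from r - 2v = -2e
    s2 = v * v - e * e       # from v^2 - s^2 = e^2
    return v, s2

def discriminant(r, v, s2):
    """Discriminant of x^2 + (r - 2v)x + (v^2 - s2); zero means a double root."""
    b = Fraction(r) - 2 * v
    c = v * v - s2
    return b * b - 4 * c

r, e = 4, 3
v, s2 = descartes_normal_parabola(r, e)
print("v =", v, " subnormal v - e =", v - e, " s^2 =", s2)
print("discriminant =", discriminant(r, v, s2))
# The circle centered at (v, 0) passes through C = (e, y) with y^2 = r*e:
print("circle through C:", (e - v) ** 2 + r * e == s2)
```

The subnormal v - e comes out as r/2 whatever the point, the classical constant-subnormal property of the parabola.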

It is easy to see that Descartes’s method would become cumbersome for curves of 
higher degree. After equating coefficients, one would end up with a large number of 
equations. It is for this reason that Hudde searched for a simpler approach. 


5.3 Hudde’s Rule for a Double Root


In his letter of February 1658 to van Schooten, Hudde gave a rule to determine 
conditions for a polynomial to have a double root:¹⁴


If in an equation two roots are equal and this is multiplied by an arbitrary arithmetical progression, 
naturally the first term of the equation by the first term of the progression, the second term of the 
equation by the second term of the progression and so on: I say that the product will be an equation 
in which one of the afore-mentioned roots will be found. 


13 Rabuel (1730) p. 314.
14 Grootendorst and van Maanen (1982) p. 107. 
Hudde gave a proof for a fifth-degree polynomial,¹⁵ and it works in general. In
modern notation, suppose $b$ is a double root of $f(x) = 0$. Then

$$f(x) = (x - b)^2 (c_0 + c_1x + c_2x^2 + \cdots + c_{n-2}x^{n-2})
 = \sum_{k=0}^{n-2} c_k (x^{k+2} - 2bx^{k+1} + b^2x^k).$$

If this equation is multiplied term-wise by an arithmetic progression, so that the term
in $x^j$ is multiplied by $p + qj$, where $p$ and $q$ are integers and $j = 0, 1, \ldots, n$, then we
arrive at the polynomial

$$\begin{aligned}
g(x) &= \sum_{k=0}^{n-2} c_k\big((p + q(k+2))x^{k+2} - 2b(p + q(k+1))x^{k+1} + b^2(p + qk)x^k\big) \\
 &= \sum_{k=0}^{n-2} c_k(p + qk)x^k (x^2 - 2bx + b^2) + \sum_{k=0}^{n-2} c_k\,2q\,(x^{k+2} - bx^{k+1}) \\
 &= (x - b)^2 \sum_{k=0}^{n-2} c_k(p + qk)x^k + 2q(x - b)\sum_{k=0}^{n-2} c_kx^{k+1}.
\end{aligned}$$
k=0 k=0 


Clearly, x = b is a root of g(x). For a modern proof of Hudde’s rule, observe 
that g(x) = pf(x) + qxf’(x), where f’(x) is the derivative of f(x). By writing 
f(x) = (x — a)*h(x), we get f’(x) = 2(x — a)h(x) + (x — a)*h' (x). Thus, if x =a 
is a double root of f(x), then x = a is also a root of f’(x) and conversely. 
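Hudde's rule is easy to try on a concrete polynomial. The following sketch uses an example chosen for illustration, f(x) = (x - 2)^2 (x + 3), together with hypothetical helper names; it multiplies the coefficients by several arithmetic progressions p + qk and evaluates the result at the double root 2 and at the simple root -3.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) by Horner's scheme."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def hudde_multiply(coeffs, p, q):
    """Multiply the coefficient of x**k by the progression term p + q*k."""
    return [(p + q * k) * c for k, c in enumerate(coeffs)]

# f(x) = (x - 2)^2 (x + 3) = x^3 - x^2 - 8x + 12, coefficients in ascending powers
f = [12, -8, -1, 1]
for p, q in [(0, 1), (3, -1), (5, 2)]:
    g = hudde_multiply(f, p, q)
    print((p, q), "g(2) =", evaluate(g, 2), " g(-3) =", evaluate(g, -3))
```

The progression 3, 2, 1, 0 annihilates the leading term, which shows how a judicious choice of progression can lower the degree of the equation to be solved.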

Hudde’s rule greatly reduced the computation required for Descartes’s method, 
especially if the arithmetic progression was chosen judiciously. Van Schooten’s book 
gave several examples of the application of Hudde’s rule, and van Heuraet used it in 
his work on rectification. 


5.4 Van Heuraet’s Letter on Rectification 


We present van Heuraet’s 1659 rectification method largely in his own terms with
some modification.¹⁶ Concerning Figure 5.2, van Heuraet wrote, in the classical style,
“CM is to CQ as Σ to MI,” where Σ denoted a fixed line segment. We describe
this relationship as $\frac{CQ}{CM} = MI$, eliminating the reference to Σ. Unlike van Heuraet,
we now view Σ as a number and set it equal to 1. Van Heuraet set out to find the
length of the curve ACE. CQ was normal to the curve, and CN was the tangent at C.
The point I was determined so that $\frac{CQ}{CM} = MI$, where MI was perpendicular to
AQ. The locus of the point I then determined the curve GIL, and the area under this
curve rectified ACE. Once again, recall van Heuraet’s perspective: He wrote that the
area under the curve GIL equaled the area of the rectangle with one side as Σ and

15 ibid. pp. 107-108.
16 For the original letter and English translation, see ibid. pp. 95-105. 
Figure 5.2 Van Heuraet’s diagram.


the other side equal to the length of the curve ACE. To prove this, he observed that
the similarity of the triangles STX and CMQ gave

$$\frac{ST}{SX} = \frac{CQ}{CM} \left(= \frac{MI}{\Sigma}\right), \quad\text{or}\quad ST = MI \cdot SX.$$

Thus, the length of ST was the area of the rectangle of base SX and height MI. The
lengths of the tangents taken at successive points along AE approximated the length
of the curve ACE; when the number of points was increased to infinity, the length of
the curve equaled the area under GIL.

Van Heuraet then explained how the result might be applied to the semicubical
parabola $y^2 = \frac{x^3}{a}$, where AM = x and MC = y. He let AQ = s, CQ = v. Then

$$s^2 - 2sx + x^2 + \frac{x^3}{a} = v^2. \tag{5.2}$$

Following Descartes, van Heuraet noted that there were two equal roots of the equation
implied by the simultaneous equations $y^2 = \frac{x^3}{a}$ and $(s - x)^2 + y^2 = v^2$. So he
multiplied equation (5.2), according to Hudde’s method, by 0, 1, 2, 3, 0 to get

$$-2sx + 2x^2 + \frac{3x^3}{a} = 0 \quad\text{or}\quad s = x + \frac{3x^2}{2a}.$$


Thus

$$MI = \frac{CQ}{CM} = \frac{v}{y} = \sqrt{1 + \frac{9x}{4a}},$$

and the area under the curve GIL could be expressed as

$$\frac{8a}{27}\left(1 + \frac{9x}{4a}\right)^{3/2} - \frac{8a}{27}.$$

Van Heuraet then pointed out that the lengths of the curves defined by $y^4 = \frac{x^5}{a}$,
$y^6 = \frac{x^7}{a}$, $y^8 = \frac{x^9}{a}$, and so on to infinity, could be found in a similar way. However, in
the case of a parabola $y = \frac{x^2}{a}$, one had to compute the area under the hyperbola
$y = \sqrt{4x^2 + a^2}$. He concluded, “From this exactly we learn that the length of the parabolic
curve cannot be found unless at the same time the quadrature of the hyperbola is found
and vice versa.”
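Van Heuraet's result for the semicubical parabola can be compared with a direct polygonal approximation of the curve. The sketch below assumes the fixed segment is taken as 1 and a > 0, so that the arc from the vertex to abscissa x has length (8a/27)((1 + 9x/(4a))^(3/2) - 1); the function names and sample values are illustrative only.

```python
import math

def van_heuraet_length(a, x):
    """Area under MI = sqrt(1 + 9t/(4a)) for 0 <= t <= x."""
    return (8 * a / 27) * ((1 + 9 * x / (4 * a)) ** 1.5 - 1)

def polygonal_length(a, x, pieces=100_000):
    """Length of an inscribed polygon on y = sqrt(t**3 / a), 0 <= t <= x."""
    total = 0.0
    prev_t, prev_y = 0.0, 0.0
    for i in range(1, pieces + 1):
        t = x * i / pieces
        y = math.sqrt(t ** 3 / a)
        total += math.hypot(t - prev_t, y - prev_y)   # chord between neighbors
        prev_t, prev_y = t, y
    return total

a, x = 1.0, 2.0
print(van_heuraet_length(a, x))
print(polygonal_length(a, x))
```

The two numbers should agree to many decimal places, with the polygon slightly shorter because chords underestimate arcs.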


5.5 Newton’s Rectification of a Curve 


Isaac Newton carefully studied van Schooten’s book, so he understood van Heuraet’s 
method. Newton worked out a simpler method of rectification, based on his conception 
of a curve as a dynamic entity, or as a moving point. In his 1671 treatise, Of the 
Method of Fluxions and Infinite Series,¹⁷ Newton treated arc length by the approach
he developed in his 1666 tract on calculus (or fluxions). Referring to Figure 5.3, he 
explained his derivation in the text presented by the 1737 editor, modified to include 
Newton’s later “dot” notation: 


The Fluxion of the Length is discovered by putting it equal to the square root of the sum of 
the squares of the Fluxion of the Absciss and of the Ordinate. For let RN be the perpendicular 
Ordinate moving upon the Absciss MN. And let QR be the proposed Curve, at which RN is 
terminated. Then calling MN = s, NR = t, and QR = v, and their Fluxions $\dot{s}$, $\dot{t}$, and $\dot{v}$,
respectively; conceive the line NR to move into the place nr infinitely near the former, and
letting fall Rs perpendicularly to nr; then Rs, sr and Rr, will be contemporaneous moments of
the lines MN, NR, and QR, by the accession of which they become Mn, nr, and Qr; but as
these are to each other as the Fluxions of the same lines, and because of the Rectangle Rsr, it will
be $\sqrt{Rs^2 + sr^2} = Rr$, or $\sqrt{\dot{s}^2 + \dot{t}^2} = \dot{v}$.


Figure 5.3 Newton’s rectification of a curve.


17 Newton (1964-1967) vol. I, pp. 173-174.


Later in the treatise, he added that one may take $\dot{s} = 1$. This gives exactly the formula
we have in textbooks now. It is interesting that some of his examples still appear in
modern textbooks. For example, Newton considered the equation $y = \frac{z^3}{aa} + \frac{aa}{12z}$ with
$a$ a constant. Taking $\dot{z} = 1$, he had $\dot{y} = \frac{3zz}{aa} - \frac{aa}{12zz}$ and $\sqrt{1 + \dot{y}^2} = \frac{3zz}{aa} + \frac{aa}{12zz}$. Thus,
the arc length was given by $\frac{z^3}{aa} - \frac{aa}{12z}$. Newton also went on to find the constant of
integration.
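The reason this example works so neatly is that 1 plus the square of the fluxion of y is itself a perfect square. A small exact check with rational numbers makes this visible; the function names and the sample values of z and a below are assumptions for the illustration.

```python
from fractions import Fraction

def y_fluxion(z, a):
    """Fluxion of y = z^3/aa + aa/(12z) when the fluxion of z is 1."""
    z, a = Fraction(z), Fraction(a)
    return 3 * z * z / (a * a) - a * a / (12 * z * z)

def arc_fluxion(z, a):
    """Newton's value for sqrt(1 + y_fluxion^2)."""
    z, a = Fraction(z), Fraction(a)
    return 3 * z * z / (a * a) + a * a / (12 * z * z)

def arc_function(z, a):
    """A fluent of arc_fluxion: z^3/aa - aa/(12z)."""
    z, a = Fraction(z), Fraction(a)
    return z ** 3 / (a * a) - a * a / (12 * z)

for z, a in [(1, 1), (Fraction(3, 2), 2), (5, Fraction(7, 3))]:
    print(z, a, 1 + y_fluxion(z, a) ** 2 == arc_fluxion(z, a) ** 2)

# Arc length between z = 1 and z = 2 for a = 1, as a difference of the fluent
print(arc_function(2, 1) - arc_function(1, 1))
```

Taking a difference of the fluent between two values of z sidesteps the constant of integration.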


5.6 Leibniz’s Derivation of the Arc Length 


In 1673, Leibniz became interested in problems related to arc length;¹⁸ following
Pascal, he considered the characteristic triangle of a curve, as shown in Figure 5.4.
Note that the characteristic triangle has sides of lengths dx, dy, and ds. After reading
van Heuraet’s work on arc length, Leibniz attempted to find the arc length of a parabola
by means of an infinite series. In a later, undated manuscript, probably from around
1680,¹⁹ he noted the length of the infinitesimal arc as $\sqrt{dx^2 + dy^2}$. Thus,

$$ds = \sqrt{dx^2 + dy^2}.$$

By 1680, Leibniz had developed his notation and his ideas on integration so that he
could write

$$s = \int \sqrt{dx^2 + dy^2} = \int \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx.$$
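For the parabola y = x^2 the arc length integral can be expanded by the binomial series and integrated term by term. The sketch below is a modern illustration and not a reconstruction of Leibniz's manuscript: it assumes |2x| < 1 so that the series converges, and compares the result with the closed form involving the inverse hyperbolic sine.

```python
import math

def parabola_arc_series(x, terms=200):
    """Arc length of y = t**2 on [0, x] from the binomial series of sqrt(1 + 4t^2)."""
    total = 0.0
    coeff = 1.0           # binomial coefficient C(1/2, k), starting with k = 0
    power = x             # holds 4**k * x**(2k + 1)
    for k in range(terms):
        total += coeff * power / (2 * k + 1)
        coeff *= (0.5 - k) / (k + 1)
        power *= 4 * x * x
    return total

def parabola_arc_closed_form(x):
    """Antiderivative of sqrt(1 + 4x^2) vanishing at x = 0."""
    return x * math.sqrt(1 + 4 * x * x) / 2 + math.asinh(2 * x) / 4

for x in (0.1, 0.25, 0.4):
    print(x, parabola_arc_series(x), parabola_arc_closed_form(x))
```

The inverse hyperbolic sine in the closed form is the modern trace of the hyperbola quadrature that Huygens and van Heuraet tied to the parabola.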


5.7 Exercises


(1) Find the lengths, between two arbitrary points, of curves defined by the
equations:

$$y = \frac{2(a^2 + z^2)^{3/2}}{3a^2};\quad ay^2 = z^3;\quad ay^4 = z^5;\quad ay^6 = z^7;\quad
ay^{2n} = z^{2n+1};\quad ay^{2n-1} = z^{2n};\quad y = (a^2 + bz^2)^{1/2}.$$

See Newton (1964–1967) vol. 1, pp. 181-187. Note that in $y = (a^2 + bz^2)^{1/2}$, Newton
expressed the arc length as an infinite series.


Figure 5.4 Leibniz on arc length. 


18 See Probst (2015) p. 118. 
19 Leibniz (1920) pp. 139-141. 
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(2) Find the length of any part of the equiangular or logarithmic spiral. Note that 


(3 


(4 


(5 


(6 


wm 


) 


wm 


wm 


P cote where a and @ are 


this spiral in polar coordinates is defined by r = ae 
constants. 

Find the length of any arc of the Archimedean spiral defined by r = aé. 
Stone (1730) worked out the examples in Exercises 2 and 3. Edmund Stone 
(1700-1768) was the son of the gardener of the Duke of Argyll. He taught 
himself mathematics, Latin, and French and translated mathematical works 
from these languages into English. Elected to the Royal Society in 1725, Stone 
was financially supported by the Duke whose death in 1743 left him destitute. 
See Pierpoint (1997). 

(4) Take a triangle $ABC$ whose sides $AB = x$, $AC = a$, $BC = \frac{a^2}{x}$ and the
perpendicular (altitude) $CD = \frac{a^3}{x^2}$ form a geometric progression. Show that

$$x = a\sqrt{\frac{1}{2} + \frac{\sqrt{5}}{2}}.$$

See Newton (1964-1967) p. 63. This exercise is taken from Newton’s book on 
algebra, illustrating that algebra could be used to solve geometric problems. 
Viete had earlier used algebra in this way. Note, however, that this use of 
algebra is different from the algebraic geometry of Descartes and Fermat, in 
which algebraic equations are used from the outset to define curves. 


(5) In a triangle $ABC$, let $AC = a$, $BC = b$, $AB = x$. Let $CD = c$ bisect the
angle at $C$. Show that $x = (a + b)\sqrt{\frac{ab - c^2}{ab}}$. See Simpson (1800) p. 261. This
is an example of algebra being used in the service of geometrically defined 
problems. 

(6) Suppose ABC is a triangle such that the lengths of the bisectors of the angles B
and C are equal. Use the result of the previous exercise to prove that the triangle 
is isosceles. In 1840, this theorem, now known as the Lehmus-Steiner theorem 
or the internal bisectors problem, was suggested as a problem by D. C. Lehmus 
(1780-1863) to the great Swiss geometer, Jacob Steiner (1796-1867). As a 
high school student during the 1930s, A. K. Mustafy rediscovered Simpson’s 
result and applied it to solve this problem. A. S. Mittal has pointed out to me 
that if the trisectors or n-sectors are equal, the triangle must be isosceles; the 
reader may wish to prove this also. 


5.8 Notes on the Literature 


Descartes (1954), translated by Smith and Latham, gives both an English translation
and the original French of Descartes’s book on geometry. Rabuel’s 1730 commentary
on Descartes’s Géométrie is very helpful because it presents many examples of
Descartes’s method. Edmund Stone, mentioned in Exercise 3, wrote a 1730 book
on calculus, with two parts: a translation of G. l’Hôpital’s 1696 differential calculus
text and a treatment of the integral calculus. Guicciardini’s (1989) discussion of other
eighteenth-century British calculus textbooks provides much interesting information
on those books and their authors. For example, Guicciardini suggests that Bishop 
(George) Berkeley may have used Stone’s presentation as mathematical background 
for his 1734 work, containing his philosophical objection to the concept of the 
infinitesimal. Pierpoint (1997) gives Stone’s dates as 1700-1768, as opposed to others 
who give 1695 as his date of birth. 


6 


Inequalities 


6.1 Preliminary Remarks 


In his 1928 presidential address to the London Mathematical Society,¹ G. H. Hardy
observed, “A thorough mastery of elementary inequalities is to-day one of the first
necessary qualifications for research in the theory of functions.” He also recalled,
“I think that it was Harald Bohr who remarked to me that ‘all analysts spend half their
time hunting through the literature for inequalities which they want to use and cannot 
prove.’ ” It is surprising, however, that the history of one of the most basic inequalities, 
the arithmetic and geometric means inequality (AMGM), is tied up with the theory of 
algebraic equations. Inequalities connected with the symmetric functions of the roots 
of an equation were used to determine the number of that equation’s positive and 
negative roots. In 1665-1666, Newton laid down the foundation in this area when, 
in order to determine the bounds on the number of positive and negative roots of 
equations, he stated a far-reaching generalization of Descartes’s rule of signs. 

The arithmetic and geometric means inequality states that if there are $n$ nonnegative
numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$ and $n$ positive numbers $q_1, q_2, \ldots, q_n$ such that
$\sum_{i=1}^n q_i = 1$, then

$$\alpha_1^{q_1}\alpha_2^{q_2}\cdots\alpha_n^{q_n} \le \sum_{i=1}^n q_i\alpha_i, \tag{6.1}$$

where equality holds only when all the $\alpha_i$ are equal. The theory of equations has an
interesting connection with the case for which $q_i = \frac{1}{n}$:

$$(\alpha_1\alpha_2\cdots\alpha_n)^{1/n} \le \frac{1}{n}\sum_{i=1}^n \alpha_i. \tag{6.2}$$


1 Hardy (1929).
Note that when $n = 2$, the AMGM is simply another form of

$$(\alpha_1 - \alpha_2)^2 \ge 0; \tag{6.3}$$

this case can probably be attributed to Euclid. The nature of the relationship between
the AMGM and algebraic equations is clear from (6.2). To see this, suppose that
$\alpha_1, \alpha_2, \ldots, \alpha_n$ are the roots of

$$x^n - A_1x^{n-1} + A_2x^{n-2} + \cdots + (-1)^nA_n = 0.$$

Then (6.2) is identical with the inequality $A_n^{1/n} \le \frac{A_1}{n}$.
The three-dimensional case of the AMGM was first stated and proved by Thomas
Harriot (c. 1560-1621) to analyze the roots of a cubic. Before this, François Viète
(1540-1603) gave the condition under which a cubic could have distinct positive roots;
Harriot then noted that Viète’s condition was insufficient and that one also required
$A_3^{1/3} < \frac{A_1}{3}$. A substantial portion of Harriot’s algebraic work arose from his attempts to
improve on both the notation and the results of the algebraist Viète. One of Harriot’s
innovations was to make algebra completely symbolic. Much of his work dates from 
about 1594 to the early 1600s, but his book on algebra, Artis Analyticae Praxis, was 
published in 1631, ten years after his death. And even this book omitted significant 
portions of Harriot’s original text and, in places, changed the order of presentation so 
that the text lost its clarity. Harriot set up a new, convenient notation for inequality 
relations, and it is still in use, although Harriot’s inequality symbol was very large.
William Oughtred, whose work was done later, independently introduced a different 
notation for inequality. One can see this cumbersome notation in the early manuscripts 
of Newton, but it soon fell into disuse. Harriot, with his effective notation, showed that 
one could carry out algebraic operations without using explanatory sentences or words 
and he demonstrated the superiority of his notation by rewriting Viète’s expressions.
The Harriot scholar J. Stedall points out² some key examples of this: Where Viète
wrote: If to $\frac{A\ \text{plane}}{B}$ there should be added $\frac{Z\ \text{squared}}{G}$, the sum will be
$\frac{G\ \text{times}\ A\ \text{plane}\ +\ B\ \text{times}\ Z\ \text{squared}}{B\ \text{times}\ G}$, Harriot wrote: $\frac{ac}{b} + \frac{zz}{g} = \frac{acg + bzz}{bg}$. Viète, under the influence
of the Greek mathematicians, wrote A plane, meaning that A was a two-dimensional
object; Harriot instead had ac. Then again, Viète described his example of antithesis:


A squared minus D plane is supposed equal to G squared minus B times A. I say that A squared 
plus B times A is equal to G squared plus D plane and that by this transposition and under 
opposite signs of conjunction the equation is not changed.

Harriot’s streamlined notation gives us: 


Suppose aa — dc = gg — ba. I say that aa + ba = gg + dc by antithesis. 


Descartes made similar advances in notation and in some places went beyond
Harriot. For example, Descartes wrote $a^3$ or $a^4$, in place of aaa or aaaa, retaining


2 Harriot and Stedall (2003) pp. 8-11. 
aa for $a^2$. Moreover, Descartes also stated a rule giving an upper bound for the number
of positive (negative) roots of an equation. Its extension by Newton contributed further
to the discovery of some important inequalities. In his 1637 La Géométrie Descartes
stated his rule:³ “We can determine also the number of true [positive] and false
[negative] roots that any equation can have as follows: An equation can have as many
true roots as it contains changes of sign from + to − or from − to +; and as many false
roots as the number of times two + signs or two − signs are found in succession.”
Thus, in Descartes’s example, $x^4 - 4x^3 - 19x^2 + 106x - 120 = 0$, the term $+x^4$
followed by the term $-4x^3$, then $-19x^2$ followed by $106x$ and, finally, $106x$ followed
by $-120$ net a total of three changes of sign. According to Descartes’s recipe, there
can therefore be three positive roots. In fact, the roots are 2, 3, 4, and −5. The negative
root is indicated by one repeated sign: $-4x^3$ followed by $-19x^2$. A rudimentary form
of this rule of signs can be found in the earlier work of Faulhaber and Roth. Indeed,
Manders has suggested⁴ with some justification that Faulhaber and Descartes may
have collaborated in analyzing the work of Roth.
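Descartes's count for this example can be reproduced in a few lines. The sketch below is a modern restatement, not Descartes's own procedure: it counts sign changes for the positive roots and, for the negative roots, counts sign changes after substituting −x for x, which for a polynomial with no missing terms matches his count of repeated signs. The helper names `sign_changes` and `evaluate` are ours.

```python
def sign_changes(coeffs):
    """Count changes of sign between consecutive nonzero coefficients."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def evaluate(coeffs, x):
    """Evaluate the polynomial (highest power first) by Horner's scheme."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

# Descartes's example: x^4 - 4x^3 - 19x^2 + 106x - 120 = 0
coeffs = [1, -4, -19, 106, -120]
positive_bound = sign_changes(coeffs)

# Substituting -x for x flips the sign of the odd-degree coefficients.
n = len(coeffs) - 1
flipped = [c * (-1) ** (n - i) for i, c in enumerate(coeffs)]
negative_bound = sign_changes(flipped)

roots_ok = all(evaluate(coeffs, r) == 0 for r in (2, 3, 4, -5))
print(positive_bound, negative_bound, roots_ok)
```

Here the bounds happen to be attained exactly; in general the rule gives only upper bounds.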

Since Descartes’s rule gave an upper bound for the number of positive roots and 
for the number of negative roots, it also determined a lower bound for the number 
of complex roots. Newton gave an extension of this rule, yielding a more accurate 
lower bound for many cases. This extension also directly connected this problem 
with certain inequalities satisfied by the coefficients of the given polynomial. Such 
inequalities included the AMGM. Newton’s rule involved the consideration of the 
sequence of polynomials quadratic in the coefficients of the original polynomial. 
Newton stated this rule,⁵ called by Sylvester “Newton’s incomplete rule,” in his
Arithmetica Universalis, in the section “Of the Nature of the Roots of an Equation.” 


But you may know almost by this rule how many roots are impossible. 


Make a series of fractions, whose denominators are numbers in this progression 1, 2, 3, 4, 5, &c. 
going on to the number which shall be the same as that of the dimensions of the equation; and the 
numerators the same series of numbers in a contrary order. Divide each of the latter fractions by 
each of the former. Place the fractions that come out over the middle terms of the equation. And 
under any of the middle terms, if its square, multiplied into the fraction standing over its head, is 
greater than the rectangle of the terms on both sides, place the sign +; but if it be less, the sign —. 
But under the first and last term place the sign +. And there will be as many impossible roots as 
there are changes in the series of the under-written signs from + to —, and — to +. 


He made the following remarks for the case in which two or more successive terms of 
the polynomial were zero:⁶


Where two or more terms are wanting together, under the first of the deficient terms you must 
write the sign —, under the second sign +, under the third the sign —, and so on, always varying 
the signs, except that under the last of such deficient terms you must always place +, when the 
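terms next on both sides the deficient terms have contrary signs. As in the equations

$$\begin{array}{cccccc} x^5 & +\,ax^4 & * & * & * & +\,a^5 = 0, \\ + & + & - & + & - & + \end{array} \qquad\text{and}\qquad \begin{array}{cccccc} x^5 & +\,ax^4 & * & * & * & -\,a^5 = 0; \\ + & + & - & + & + & + \end{array}$$
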

3 Descartes (1954) p. 160. 

4 Manders (2006). 

5 Newton (1964-1967) vol. 2, pp. 103-105. 
6 ibid.
the first whereof has four, and the latter two impossible roots. Thus also the equation,

$$\begin{array}{cccccccc} x^7 & -\,2x^6 & +\,3x^5 & -\,2x^4 & +\,x^3 & * & * & -\,3 = 0 \\ + & - & + & - & + & - & + & + \end{array}$$

has six impossible roots.


To understand Newton’s incomplete rule, we use modern notation. Let the polynomial be

$$a_0x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n.$$

In the sequence of fractions $\frac{n}{1}, \frac{n-1}{2}, \frac{n-2}{3}, \ldots, \frac{1}{n}$, divide the second term by
the first, the third by the second and so on to get $\frac{n-1}{2n}, \frac{2(n-2)}{3(n-1)}, \frac{3(n-3)}{4(n-2)}, \ldots$. So $\frac{n-1}{2n}$ is
placed over $a_1$, $\frac{2(n-2)}{3(n-1)}$ over $a_2$ and so on. If $\frac{n-1}{2n}a_1^2 > a_0a_2$, then a + sign is placed
under $a_1$ and a − sign if the inequality is reversed. Similarly, if $\frac{2(n-2)}{3(n-1)}a_2^2 > a_1a_3$,
then place a + sign under $a_2$ and so on. These inequalities take a simpler form if we
follow J. J. Sylvester’s notation from his 1865 paper⁷ in which Newton’s rule was
proved for the first time, almost two hundred years after it was discovered. Write the
polynomial as

$$f(x) = p_0x^n + np_1x^{n-1} + \frac{1}{2}n(n-1)p_2x^{n-2} + \cdots + np_{n-1}x + p_n. \tag{6.4}$$

The inequalities become $p_1^2 > p_0p_2$ or $p_1^2 - p_0p_2 > 0$, $p_2^2 - p_1p_3 > 0$, and so on.
Thus, Newton’s sequence of signs is obtained from the sequence of numbers

$$A_0 = p_0^2,\quad A_1 = p_1^2 - p_0p_2,\quad A_2 = p_2^2 - p_1p_3,\ \ldots,\quad A_{n-1} = p_{n-1}^2 - p_{n-2}p_n,\quad A_n = p_n^2.$$

$A_0$ and $A_n$ are always positive, while a plus sign is written under $a_k$ if

$$A_k = p_k^2 - p_{k-1}p_{k+1} > 0 \tag{6.5}$$

and a minus sign if

$$A_k = p_k^2 - p_{k-1}p_{k+1} < 0. \tag{6.6}$$

Newton gave several examples of his method, including the polynomial equation
$x^4 - 6xx - 3x - 2 = 0$. Here $a_1 = 0$ and the signs of $A_0, A_1, A_2, A_3, A_4$ came out to
be + + + − +. So, Newton wrote that there were [at least] two “impossible roots.”
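Newton's example can be recomputed from Sylvester's quantities $A_k$. The following is a minimal sketch under the assumption that no $A_k$ vanishes; it does not implement Newton's special convention for two or more missing terms. The function names `newton_signs` and `count_changes` are ours, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from math import comb

def newton_signs(coeffs):
    """Signs of A_0, ..., A_n for a polynomial given by its coefficients a_0, ..., a_n.

    Uses p_k = a_k / C(n, k) and A_k = p_k^2 - p_{k-1} p_{k+1},
    with A_0 = p_0^2 and A_n = p_n^2.
    """
    n = len(coeffs) - 1
    p = [Fraction(c, comb(n, k)) for k, c in enumerate(coeffs)]
    A = [p[0] ** 2]
    A += [p[k] ** 2 - p[k - 1] * p[k + 1] for k in range(1, n)]
    A.append(p[n] ** 2)
    if any(a == 0 for a in A):
        # Newton's extra convention for vanishing terms is not handled here.
        raise ValueError("some A_k is zero; this sketch does not cover that case")
    return ['+' if a > 0 else '-' for a in A]

def count_changes(signs):
    """Number of changes of sign in the sequence; Newton's count of impossible roots."""
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

# Newton's example: x^4 - 6x^2 - 3x - 2 = 0
signs = newton_signs([1, 0, -6, -3, -2])
print(' '.join(signs), count_changes(signs))
```

The two sign changes correspond to Newton's conclusion that the equation has at least two impossible roots.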

In a 1728 paper⁸ appearing in the Philosophical Transactions, the Scottish mathematician
George Campbell published an incomplete proof of the incomplete rule
(Newton’s rule for complex roots); his efforts were sufficient to obtain the AMGM.


7 Sylvester (1973) vol. 2, p. 498. 
8 Campbell (1728). 




A little later, in 1729, Colin Maclaurin published a paper⁹ in the same journal
proving similar results. A priority dispute arose in this context, although it is generally
recognized that Maclaurin’s work was independent. Both Campbell and Maclaurin used
the idea that the derivative of a polynomial with only real roots also had only real roots.
Note that, in fact, this follows from Rolle’s theorem, published in 1691,¹⁰ though
neither Campbell nor Maclaurin referred to Rolle. Campbell wrote that the derivative
result was well-known to algebraists “and is easily made evident by the method of 
the maxima and minima.” He began his paper by stating the condition under which 
a quadratic would have complex roots. He then showed that a general polynomial 
would have complex roots if, after repeated differentiation, it produced a quadratic 
with complex roots. Extremely little is known of Campbell’s life; he was elected to 
the Royal Society on the strength of his paper in the Transactions. 

In his 1729 paper, Maclaurin stated and proved, among other results, his
inequalities:

$$p_1 \ge p_2^{1/2} \ge p_3^{1/3} \ge \cdots \ge p_n^{1/n}, \tag{6.7}$$

where the $p_k$ were all positive and defined by (6.4) with $p_0 = 1$. Most of Maclaurin’s
work in algebra arose out of his efforts to prove Newton’s unproven statements and he 
presented them in his Treatise of Algebra, unfortunately published only posthumously 
in 1748. 

Later, especially in the nineteenth century, the arithmetic and geometric means and 
related inequalities became objects of study on their own merits; they were then stated 
and proved independent of their use in analyzing the roots of algebraic equations. In 
the 1820s, Cauchy gave an inductive proof of AMGM in his lectures at the École
Polytechnique. He started with $n = 2$ and then proved it for all powers of 2. He
then obtained the result for all positive integers by a proof containing an interesting
trick. In 1906, Jensen discovered that Cauchy’s method could be generalized to convex
functions, a fruitful concept he discovered and named, though it is implicitly contained
in the work of Otto Hölder. To understand Jensen’s motivation, observe that the
two-dimensional case of (6.1) could be written as

$$e^{x_1} + e^{x_2} \ge 2e^{\frac{x_1 + x_2}{2}}.$$

This led Jensen to define a convex function on an interval $[a,b]$ as a continuous
function satisfying

$$\phi(x_1) + \phi(x_2) \ge 2\phi\left(\frac{x_1 + x_2}{2}\right) \tag{6.8}$$

for all pairs of numbers $x_1, x_2$ in $[a,b]$. Cauchy’s proof could be applied in this
situation without any change and Jensen was able to prove that for any $n$ numbers
$x_1, x_2, \ldots, x_n$ in $[a,b]$,


9 Maclaurin (1729). 
10 Rolle (1691). 
$$\phi\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right) \le \frac{\phi(x_1) + \phi(x_2) + \cdots + \phi(x_n)}{n}. \tag{6.9}$$


As we shall see later in this chapter, Jensen did not require continuity up to this 
point, but he needed it for the generalization to (6.1). 

Johan Jensen (1859-1925) was a largely self-taught Danish mathematician. He 
studied in an engineering college where he took courses in mathematics and physics. 
To support himself, he took a job in a Copenhagen telephone company in 1881. 
His energy and intelligence soon got him a high position in the company where he 
remained for the rest of his life. The rapid early development of telephone technology 
in Denmark was mainly due to Jensen. His spare time, however, was devoted to 
the study of mathematics; the function theorist Weierstrass, also self-taught to a 
great extent, was his hero. Jensen himself made a significant contribution to the 
theory of complex analytic functions, laying the foundation for Nevanlinna’s theory 
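of meromorphic functions of the 1920s. Jensen wrote his generalization of (6.1) in the
form¹¹

$$\phi\left(\frac{\sum a_\mu x_\mu}{a}\right) \le \frac{\sum a_\mu\phi(x_\mu)}{a}, \tag{6.10}$$

where $a = \sum a_\mu$ and $a_\mu > 0$. He took $\phi(x) = x^p$, $p > 1$, $x > 0$ to obtain
the important inequality named after Hölder, one form of which states that for
$p$, $b_\mu$, and $c_\mu$ all positive, if $\frac{1}{p} + \frac{1}{q} = 1$, then

$$\sum b_\mu c_\mu \le \left(\sum b_\mu^p\right)^{1/p}\left(\sum c_\mu^q\right)^{1/q}. \tag{6.11}$$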


Interestingly, in 1888, L. J. Rogers was the first to state and prove the inequality
(6.11);¹² he first proved (6.1) and then derived several corollaries, including the Hölder
inequality. A year later, Hölder gave the generalization (6.10),¹³ except that he took
$\phi(x)$ to be differentiable, with $\phi'(x)$ increasing. It is not difficult to prove that such
functions are convex in Jensen’s sense. Hölder noted that his work was based on that
of Rogers, and Jensen also credited Rogers.

The case $p = q = 2$ of (6.11) is called the Cauchy–Schwarz inequality. Cauchy
derived it¹⁴ from an identity with the form, here given for three dimensions:

$$(ax + by + cz)^2 + (ay - bx)^2 + (az - cx)^2 + (bz - cy)^2 = (a^2 + b^2 + c^2)(x^2 + y^2 + z^2). \tag{6.12}$$

It is clear that the identity implies

$$\sum ax \le \left(\sum a^2\right)^{1/2}\left(\sum x^2\right)^{1/2} \tag{6.13}$$


11 Jensen (1906).
12 Rogers (1888).
13 Hölder (1889).
14 Cauchy et al. (2007) p. 304.
and that equality would hold if $ay = bx$, $az = cx$, and $bz = cy$, that is, $\frac{a}{x} = \frac{b}{y} = \frac{c}{z}$. It
is evident that identity (6.12) can be extended to any number of variables. Eighteenth- 
century mathematicians such as Euler and Lagrange applied this and other identities 
involving sums of squares to physics problems, number theory and other areas. 
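The three-dimensional identity (6.12) is easy to confirm with exact integer arithmetic. The sketch below is only a spot check on a few sample values, not a proof; the function name `identity_sides` is ours.

```python
def identity_sides(a, b, c, x, y, z):
    """Return the left and right sides of Cauchy's identity (6.12)."""
    lhs = ((a * x + b * y + c * z) ** 2
           + (a * y - b * x) ** 2
           + (a * z - c * x) ** 2
           + (b * z - c * y) ** 2)
    rhs = (a * a + b * b + c * c) * (x * x + y * y + z * z)
    return lhs, rhs

samples = [(1, 2, 3, 4, 5, 6), (2, -1, 0, 3, 7, -5), (1, 1, 1, 2, 2, 2)]
for s in samples:
    lhs, rhs = identity_sides(*s)
    a, b, c, x, y, z = s
    # Dropping the three squared differences gives the Cauchy-Schwarz inequality.
    print(s, lhs == rhs, (a * x + b * y + c * z) ** 2 <= rhs)
```

In the last sample the two triples are proportional, so the squared differences vanish and the inequality becomes an equality.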

In 1885, Hermann Schwarz (1843-1921) gave the integral analog of the inequality
with which his name is now associated.¹⁵ But this analog was actually presented
as early as 1859¹⁶ by Viktor Bunyakovski (1804-1889), a Russian mathematician
with an interest in probability theory who had studied with Cauchy in Paris. Though
Bunyakovski made no claim to this result, it is sometimes called the
Cauchy–Schwarz–Bunyakovski inequality. Bunyakovski was very familiar with Laplace’s
work in probability theory, a subject in which he did his best work and for which
he worked out a Russian terminology, introducing many terms which have become
standard in that language.

We briefly mention other related results, without detailed definitions. One of
the earliest applications of the infinite form of the Cauchy–Schwarz and Hölder
inequalities was in functional analysis, dealing with infinite series and integrals.
For example, in a pioneering paper of 1906,¹⁷ David Hilbert defined $l^2$ spaces
consisting of sequences of complex numbers $\{a_n\}$ such that the sum of the squares of
absolute values converged. The infinite form of the Cauchy–Schwarz inequality may
be employed to show that an inner product can be defined on $l^2$. In a paper of 1910,¹⁸
the Hungarian mathematician Frigyes Riesz (1880-1956) generalized Hilbert’s work.
Dieudonné called this paper “second only in importance for the development of
Functional Analysis to Hilbert’s 1906 paper.”¹⁹ Riesz kept well abreast of the work
of Hilbert, Erhard Schmidt, Ernst Hellinger, Otto Toeplitz, Ernst Fischer, Henri
Lebesgue, Jacques Hadamard, and Maurice Fréchet. With such inspiration, Riesz was
able to define and develop the theory of $l^p$ and $L^p$ spaces. By using Minkowski’s
inequality, he proved that these were vector spaces; employing the Hölder inequality,
he showed that $l^q$ and $L^q$ were duals of $l^p$ and $L^p$, where $q = \frac{p}{p-1}$ and $p > 1$. In
a proof very different from Minkowski’s proof related to the geometry of numbers,
Riesz demonstrated that Minkowski’s inequality could be obtained from Hölder’s.
Thus, inequalities originating in the study of algebraic equations eventually led to 
inequalities now fundamental to analysis. 


6.2 Harriot’s Proof of the Arithmetic and Geometric Means Inequality 


Harriot proved the AMGM only in the cases of two and three dimensions, but his 
motivation, notation, and mode of presentation are worthy of note. Harriot began by 
proving the inequality for dimension 2:²⁰


15 Schwarz (1885).

16 Bunyakovski (1859). 

17 Hilbert (1906). 

18 Riesz (1910). 

19 Dieudonné (1981) p. 124. 

20 Harriot and Stedall (2003) p. 195. 
Lemma 1 Suppose $b$, $a$, $\frac{aa}{b}$ are in continued proportion and suppose $b > a$. I say that
$b + \frac{aa}{b} > 2a$, that is, $bb + aa > 2ab$, so $bb - ba > ba - aa$, that is,

$$\begin{array}{c} b - a \\ b \end{array} \;>\; \begin{array}{c} b - a \\ a \end{array} \tag{6.14}$$

so $b > a$ and this is so. Therefore the lemma is true.


Note that the expression on the left-hand side of (6.14) was Harriot’s notation for 
(b — a)b. Harriot used this lemma to analyze the different forms taken by a cubic with 
one positive root. He proved the three-dimensional case in connection with a result of 
Viète. In his De Numerosa Potestatum Resolutione, Viète discussed a condition for a
cubic to have three distinct roots:²¹ “A cubic affected negatively by a quadratic term
and positively by a linear term is ambiguous [has distinct roots] when three times
the square of one-third the linear coefficient [of the square term] is greater than the
plane coefficient [of the first power].” Viète’s example was $x^3 - 6x^2 + 11x = 6$. Here
$3\left(\frac{6}{3}\right)^2 > 11$ and the roots were 1, 2, 3. Harriot commented that Viète’s statement
required an amendment; in order to get three positive roots, he required that “the cube
of a third of the coefficient of the square term is greater than the given constant.” This
would yield the three-dimensional case of the inequality. Harriot went on to give an
example, showing why Viète’s condition was inadequate. He noted that $aaa - 6aa + 11a = 12$
had only one positive root (namely, 4) even though Viète’s condition was
satisfied. In a similar way, he amended Viète’s remarks for the case of equal roots.
Harriot stated and proved additional lemmas, of which we give two; he gave the
comment, “But what need is there for verbose precepts, when with the formulae from
our reduction, it is possible to show all the roots directly, not only for these cases, but
for any other case you like. However, if a demonstration of these precepts is required,
we adjoin the three following lemmas.”²²

$$\frac{(b+c+d)(b+c+d)}{3} > bc + cd + bd \quad\text{and}\quad \frac{(b+c+d)(b+c+d)(b+c+d)}{27} > bcd.$$

These inequalities are particular cases of (6.7) and can be written as

$$3\left(\frac{b+c+d}{3}\right)^2 \ge bc + cd + bd \quad\text{and}\quad \left(\frac{b+c+d}{3}\right)^3 \ge bcd.$$
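The contrast between Viète's condition and Harriot's amendment can be checked directly on the two cubics mentioned above. This is a small sketch in modern terms for an equation $x^3 - sx^2 + qx = r$; the function names and the letters `s`, `q`, `r` are ours, and the conditions are cleared of fractions so that integer arithmetic suffices.

```python
def viete_condition(s, q):
    """Viete's test: three times the square of s/3 exceeds q."""
    return s * s > 3 * q  # 3 * (s/3)^2 > q, cleared of fractions

def harriot_condition(s, r):
    """Harriot's amendment: the cube of s/3 exceeds the constant r."""
    return s ** 3 > 27 * r  # (s/3)^3 > r, cleared of fractions

def residual(s, q, r, x):
    """Value of x^3 - s x^2 + q x - r."""
    return x ** 3 - s * x ** 2 + q * x - r

viete_example = (6, 11, 6)     # x^3 - 6x^2 + 11x = 6, roots 1, 2, 3
harriot_example = (6, 11, 12)  # a^3 - 6a^2 + 11a = 12, only positive root 4
for s, q, r in (viete_example, harriot_example):
    print((s, q, r), viete_condition(s, q), harriot_condition(s, r))

# Dividing x^3 - 6x^2 + 11x - 12 by (x - 4) leaves x^2 - 2x + 3,
# whose discriminant is negative, so 4 is the only real root.
quotient_discriminant = (-2) ** 2 - 4 * 1 * 3
```

Both equations pass Viète's test, but only the first passes Harriot's, in agreement with the root counts.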


As we noted previously, Descartes made advances over Harriot in terms of notation, 
though he continued to write aa instead of $a^2$; in fact, this practice continued well into
the nineteenth century as one may see in the work of Gauss, Riemann, and others. The 
notation for the fractional or irrational power was introduced by Newton in his earliest 
mathematical work. 


21 Viète (1983) p. 360.
22 ibid. pp. 233-234. 
6.3 Maclaurin’s Inequalities


Maclaurin’s novel proof of the arithmetic and geometric means inequality is worth 
studying, though it used an unproved assumption on the existence of a maximum. 
The proof consists of two steps, lemmas V and VI, contained in his 1729 paper on
algebraic equations.²³


Lemma V Let the given line AB be divided anywhere in P and the rectangles of the parts AP 
and PB will be a maximum when the parts are equal. 


In algebraic symbols, Maclaurin’s lemma would be stated: If $AB = a$ and $AP = x$,
then $x(a - x)$ is maximized when $x = \frac{a}{2}$ for $x$ in the interval $0 \le x \le a$.
Maclaurin wrote that this followed from Euclid’s Elements. He then stated and proved 
the following generalization: 


Lemma VI If the line AB is divided into any numbers of parts AC, CD, DE, EB, the product 
of all those parts multiplied into one another will be a maximum when the parts are equal among 
themselves. 


A   C   D   E   e   B


For let the point D be where you will, it is manifest that if DB be bisected in E, the product
AC × CD × DE × EB will be greater than AC × CD × De × eB, because DE × EB is greater
than De × eB; and for the same reason CE must be bisected in C and D; and consequently all the
parts AC, CD, DE, EB must be equal among themselves, that their product may be a maximum.


In other words, Maclaurin argued that if $\alpha_1, \alpha_2, \ldots, \alpha_n$ are positive quantities not
all equal to each other and their sum $\sum \alpha_i = A$ is a constant, then there exist
$\alpha_1', \alpha_2', \ldots, \alpha_n'$ with $\sum \alpha_i' = A$ and $\alpha_1'\alpha_2'\cdots\alpha_n' > \alpha_1\alpha_2\cdots\alpha_n$. Thus, if a maximum
value of the product exists, then it must occur when all the $\alpha_i$ are equal. Maclaurin
assumed that such a maximum must exist; proving this would boil down to showing
that the continuous function of $n - 1$ variables

$$\alpha_1\alpha_2\cdots\alpha_{n-1}(A - \alpha_1 - \alpha_2 - \cdots - \alpha_{n-1})$$

has a maximum in the closed domain $\alpha_1 \ge 0, \alpha_2 \ge 0, \ldots, \alpha_{n-1} \ge 0$,
$\alpha_1 + \alpha_2 + \cdots + \alpha_{n-1} \le A$. It was common for eighteenth-century mathematicians to assume the
existence of such a maximum. Lagrange did this extensively in his derivation of the
Taylor theorem with remainder. The inequality for the arithmetic and geometric means
follows from these lemmas. We see that if all values of $\alpha_i$ are equal, then $\alpha_i = \frac{A}{n}$ and
we can conclude that

$$\alpha_1\alpha_2\cdots\alpha_n \le \left(\frac{A}{n}\right)^n = \left(\frac{\alpha_1 + \alpha_2 + \cdots + \alpha_n}{n}\right)^n.$$

Moreover, equality holds if and only if all the $\alpha_i$ are identical.
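The step at the heart of Lemma VI, that equalizing two unequal parts raises the product while leaving the sum fixed, can be watched numerically. The sketch below is a variation on Maclaurin's argument rather than a transcription of it: instead of bisecting adjacent segments, it repeatedly replaces the smallest and largest parts by their average. The function name `equalize_extremes` and the starting values are ours.

```python
from fractions import Fraction
from math import prod

def equalize_extremes(parts):
    """Replace the smallest and largest parts by their average; the sum is unchanged."""
    parts = sorted(parts)
    avg = (parts[0] + parts[-1]) / 2
    return [avg] + parts[1:-1] + [avg]

parts = [Fraction(1), Fraction(2), Fraction(6)]  # a line of length 9 cut into three parts
history = [prod(parts)]
for _ in range(5):
    parts = equalize_extremes(parts)
    history.append(prod(parts))

# The products increase toward (9/3)^3 = 27, the value for three equal parts.
print([float(h) for h in history])
```

The sequence of products never decreases and stays below 27, but this by itself does not show that a maximum exists, which is the gap in Maclaurin's proof noted above.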


23 Maclaurin (1729).
6.4 Comments on Newton’s and Maclaurin’s Inequalities


The purpose of Newton’s inequalities (6.5) and (6.6) was to determine the number of
complex roots of an algebraic equation

$$p_0x^n + \binom{n}{1}p_1x^{n-1} + \binom{n}{2}p_2x^{n-2} + \cdots + \binom{n}{n-1}p_{n-1}x + p_n = 0. \tag{6.15}$$

Newton’s inequalities have interesting implications in the case in which all the
roots of the equation are real and not all identical. In such a case, the inequalities
are given by

$$A_k = p_k^2 - p_{k-1}p_{k+1} > 0, \quad k = 1, 2, \ldots, n-1. \tag{6.16}$$
We reproduce the essential ideas due to George Campbell²⁴ and Colin Maclaurin²⁵
to show that if all the roots of an algebraic equation

$$a_0x^n + a_1x^{n-1} + \cdots + a_n = 0, \quad a_i = \binom{n}{i}p_i \tag{6.17}$$

are real and not all identical, then the inequalities (6.16) hold true. Campbell and
Maclaurin applied Rolle’s theorem: If the value of a polynomial $p(x)$ is zero at two
points, then $p'(x)$ has value zero at a point in between those points. They also used the
fact that if a polynomial $p(x)$ has a zero of order $m > 1$ at $x_1$, then $p'(x)$ has a zero of
order $m - 1$ at $x_1$. Note that $x_1$ is a zero of order $m$ for $p(x)$ if $p(x) = (x - x_1)^mq(x)$
and $q(x_1) \ne 0$, that is, if $(x - x_1)^m \mid p(x)$ but $(x - x_1)^{m+1} \nmid p(x)$.

Suppose that $x = 0$ is not a root of (6.17). Then $a_n \ne 0$. Now let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be
the roots of (6.17). Campbell and Maclaurin then considered the equation whose roots
were $\frac{1}{\alpha_1}, \frac{1}{\alpha_2}, \ldots, \frac{1}{\alpha_n}$ and the derivatives of such an equation, that is, the equation

$$a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0 = 0. \tag{6.18}$$

In order to deal with both (6.17) and (6.18) simultaneously, we consider the
equation in two variables, $x$ and $y$,

$$a_0x^n + a_1x^{n-1}y + a_2x^{n-2}y^2 + \cdots + a_ny^n = 0, \tag{6.19}$$

and take the partial derivatives of (6.19) with respect to $x$ as well as $y$. Supposing all
the roots $\frac{x}{y}$ of (6.19) to be real, we obtain an equation after several partial derivatives
with respect to $x$ as well as $y$. It follows from Rolle’s theorem and its implications
that, if this new equation (obtained after several differentiations) has a root $\alpha$ of order
$m > 1$, then $\alpha$ must be a root of order $m + 1$ of the equation from which the new
equation is obtained directly by a single differentiation.


24 Campbell (1728). 
25 Maclaurin (1729). 
If $a_n \ne 0$ is assumed, then $\frac{x}{y} = 0$ is not a root of (6.19) and this implies that
$\frac{x}{y} = 0$ cannot be a multiple root of any equation obtained by partial differentiation of
(6.19) with respect to $x$ and $y$. Now take $m - 1$ derivatives of (6.19) with respect to $y$,
followed by $n - m - 1$ derivatives with respect to $x$; one arrives at the result

$$\frac{(n-m+1)!}{2}(m-1)!\,a_{m-1}x^2 + m!\,(n-m)!\,a_mxy + \frac{(m+1)!}{2}(n-m-1)!\,a_{m+1}y^2 = 0. \tag{6.20}$$

Divide (6.20) by $\frac{n!}{2}$ and set $p_k = a_k\big/\binom{n}{k}$ to get

$$p_{m-1}x^2 + 2p_mxy + p_{m+1}y^2 = 0. \tag{6.21}$$

Now note that $p_m$ and $p_{m+1}$ cannot both be zero, because in that case the derived
equation (6.20) would have zero as a multiple root. Hence the quadratic in (6.21) is
not identically zero and it has real roots. This implies that

$$p_{m-1}p_{m+1} \le p_m^2, \tag{6.22}$$

where equality holds in case the roots of (6.21) are equal. The considerations on
Rolle’s theorem show that the roots of (6.21) are equal when all the roots of (6.19)
are identical, but we have assumed this is not true. Hence

$$p_{m-1}p_{m+1} < p_m^2 \tag{6.23}$$

and Newton’s inequalities hold.

As we have seen, Maclaurin did not derive his refinement of the AMGM inequality 
(6.7) directly from Newton’s inequalities. However, the intimate relationship between 
the two sets of inequalities is quite interesting. Observe that, with $p_0 = 1$ and

$$p_k^2 - p_{k-1}p_{k+1} > 0, \tag{6.24}$$

when the points $(k, \ln p_k)$ are plotted, the straight lines joining consecutive points have
decreasing slopes. Observe this by writing $y_k = \ln p_k$ and noting that (6.24) can be
rewritten as²⁶

$$y_{k+1} - y_k < y_k - y_{k-1}. \tag{6.25}$$

Now in Figure 6.1, $y_k - y_{k-1}$ is the slope of the line joining the points $(k - 1, y_{k-1})$
and $(k, y_k)$. Figure 6.1 then shows the plot of the points; note that the lines joining the
consecutive points have decreasing slopes. The Maclaurin inequalities

$$p_{k-1}^{\frac{1}{k-1}} > p_k^{\frac{1}{k}}, \quad k = 2, 3, \ldots, n$$


26 See Steele (2004) p. 180. 
Figure 6.1 Maclaurin’s inequality.


are equivalent to

$$\frac{1}{k-1}y_{k-1} > \frac{1}{k}y_k, \quad k = 2, 3, \ldots, n.$$

Clearly, the slopes of the lines joining the origin to $(k, y_k)$, with $k = 1, 2, 3, \ldots, n$,
are decreasing. Although this is obvious from Figure 6.1, a rigorous proof can be
given by induction. Since $p_0 = 1$, the case $k = 2$ holds true because the inequality
$y_1 > \frac{1}{2}y_2$ is equivalent to the inequality $p_2p_0 < p_1^2$. Now suppose the result true for
$k < n$, that is

$$\frac{1}{k}y_k < \frac{1}{k-1}y_{k-1} \quad \text{when } k < n. \tag{6.26}$$

Next, note that (6.26) implies that $y_{k-1} > \frac{k-1}{k}y_k$ so that

$$y_{k+1} - y_k < y_k - y_{k-1} < y_k - \frac{k-1}{k}y_k = \frac{1}{k}y_k$$

or

$$y_{k+1} < y_k + \frac{1}{k}y_k = \frac{k+1}{k}y_k,$$

completing the inductive proof of Maclaurin’s inequalities.
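Both families of inequalities can be observed on a concrete set of numbers. The sketch below assumes four distinct positive numbers 1, 2, 3, 4 and takes $p_k$ to be the $k$th elementary symmetric function divided by $\binom{n}{k}$, which corresponds to (6.4) with $p_0 = 1$ for the polynomial $\prod (x + \alpha_i)$; the function name and the sample values are ours.

```python
from fractions import Fraction
from math import comb

def normalized_symmetric_means(values):
    """Return [p_0, ..., p_n] with p_k = e_k / C(n, k), e_k the k-th elementary symmetric function."""
    e = [Fraction(1)]
    for r in values:
        # Multiply the generating polynomial prod(1 + r_i t) by (1 + r t).
        new = e + [Fraction(0)]
        for k in range(1, len(new)):
            new[k] += r * e[k - 1]
        e = new
    n = len(values)
    return [e[k] / comb(n, k) for k in range(n + 1)]

values = [1, 2, 3, 4]
p = normalized_symmetric_means(values)

# Newton's inequalities (6.16): p_k^2 > p_{k-1} p_{k+1}, checked exactly.
newton_ok = all(p[k] ** 2 > p[k - 1] * p[k + 1] for k in range(1, len(values)))

# Maclaurin's chain (6.7): p_1 > p_2^(1/2) > ... > p_n^(1/n), checked in floating point.
means = [float(p[k]) ** (1.0 / k) for k in range(1, len(values) + 1)]
maclaurin_ok = all(x > y for x, y in zip(means, means[1:]))
print(p, newton_ok, means, maclaurin_ok)
```

The first and last entries of `means` are the arithmetic and geometric means of the four numbers, so the chain contains the AMGM as its outermost comparison.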


6.5 Rogers 


L. J. Rogers began his 1888 paper,²⁷ “An extension of a certain theorem in inequalities,”
with the remark,


27 Rogers (1888). 
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I propose in the following pages to show how, by a slight extension of the well-known theorem 
in inequalities concerning the arithmetic and geometrical means of n positive quantities, we can 
deduce many others, including those usually given in text-books. 

The theorem is as follows: 


If aj,a2,...,4n,b1,b2,...,bn be all positive quantities, then 


( + agby +--+ +anbn ie 


> 57 p82... pan, 6.27 
ay tangt---+an etd 52 n ( ) 


Rogers thus began his proof of his interesting theorem (6.27), from which he was
able to deduce many standard inequalities. He first let $a_1, a_2, \ldots, a_n$ be positive
integers so that the inequality (6.27) was reduced to the known result (6.2). He
next took $a_1, a_2, \ldots, a_n$ to represent rational numbers and let $N$ denote the least
common multiple of the denominators of $a_1, a_2, \ldots, a_n$ so that $N a_1 = A_1$, $N a_2 = A_2$,
..., $N a_n = A_n$, with $A_1, A_2, \ldots, A_n$ integers. He noted that since the inequality
(6.27) was true for integers $A_1, A_2, \ldots, A_n$, it followed that

\[ \left( \frac{A_1 b_1 + A_2 b_2 + \cdots + A_n b_n}{A_1 + A_2 + \cdots + A_n} \right)^{A_1 + A_2 + \cdots + A_n} \ge b_1^{A_1} b_2^{A_2} \cdots b_n^{A_n}. \]

Taking the positive $N$th root of each side then produced the required result.
He finally took $a_1, a_2, \ldots, a_n$ to be irrational numbers and applied the continuity
argument. He gave the argument, “Then we may substitute for each of these quantities
fractions, which may differ from them by less than any assigned quantities, and since
the theorem is true for the substituted fractions, we may assume it is also true for the
given incommensurables.” This proved the inequality (6.27) completely.
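The reduction from rational to integer weights is easy to test numerically. The following sketch is illustrative only: the helper name `rogers_sides` and the sample values are assumptions, and the inequality is taken in the weak form "left side at least right side".

```python
import math

def rogers_sides(a, b):
    """Return both sides of (6.27) for positive weights a and positive numbers b."""
    total = sum(a)
    mean = sum(ai * bi for ai, bi in zip(a, b)) / total
    lhs = mean ** total
    rhs = math.prod(bi ** ai for ai, bi in zip(a, b))
    return lhs, rhs

# Rational weights 1/2 and 1/3 have common denominator N = 6,
# giving the integer weights A = (3, 2).
a = [1 / 2, 1 / 3]
N = 6
A = [N * ai for ai in a]
b = [2.0, 5.0]

lhs_a, rhs_a = rogers_sides(a, b)
lhs_A, rhs_A = rogers_sides(A, b)
print(lhs_A, rhs_A)          # integer-weight case
print(lhs_a ** N, rhs_a ** N)  # rational-weight case raised to the N-th power
```

Both sides of the integer-weight inequality are the $N$th powers of the corresponding sides for the rational weights, which is why taking the positive $N$th root recovers the rational case.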

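Rogers derived a new inequality from (6.27): For $m > r > t > 0$,

\[ \left( \sum_{i=1}^{n} a_i b_i^{r} \right)^{m-t} \le \left( \sum_{i=1}^{n} a_i b_i^{m} \right)^{r-t} \left( \sum_{i=1}^{n} a_i b_i^{t} \right)^{m-r}. \tag{6.28} \]

To prove (6.28), he denoted $S_k = \sum_{i=1}^{n} a_i^k$ and first showed that

\[ S_r^{\,m-t} \le S_m^{\,r-t}\, S_t^{\,m-r} \tag{6.29} \]

by replacing $a_i$ by $a_i^r$ and $b_i$ by $a_i^{m-r}$ in (6.27) to obtain

\[ \left( \frac{S_m}{S_r} \right)^{S_r} \ge \left( a_1^{a_1^r} a_2^{a_2^r} \cdots \right)^{m-r}. \tag{6.30} \]

Next, Rogers replaced $b_i$ by $a_i^{t-r}$ to get

\[ \left( \frac{S_r}{S_t} \right)^{S_r} \le \left( a_1^{a_1^r} a_2^{a_2^r} \cdots \right)^{r-t}. \tag{6.31} \]

Combining (6.30) and (6.31) yielded

\[ \left( \frac{S_m}{S_r} \right)^{\frac{S_r}{m-r}} \ge \left( \frac{S_r}{S_t} \right)^{\frac{S_r}{r-t}} \]

and by taking the $S_r$th root of each side and rearranging terms, Rogers arrived at (6.29)
and, taking $a_i = 1$, rewrote it as

\[ \left( \sum_{i=1}^{n} a_i^{r} \right)^{m-t} \le \left( \sum_{i=1}^{n} a_i^{m} \right)^{r-t} \left( \sum_{i=1}^{n} a_i^{t} \right)^{m-r}. \tag{6.32} \]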
i=l i=l i=1 


Rogers wrote that (6.32) implied (6.28) in the same way that the case for
integers, which he called the classical case, implied (6.27). To see this, first take $a_i$,
$i = 1, 2, \ldots, n$, to be rational and reduce to the integer case. Finally, take $a_i$ to be
irrational and apply the continuity argument.

To put these inequalities in more standard form, take the $(m - t)$th root of each side
of (6.28) to get

\[ \sum_{i=1}^{n} a_i b_i^{r} \le \left( \sum_{i=1}^{n} a_i b_i^{m} \right)^{\frac{r-t}{m-t}} \left( \sum_{i=1}^{n} a_i b_i^{t} \right)^{\frac{m-r}{m-t}}. \tag{6.33} \]

Let $\frac{1}{p} = \frac{r-t}{m-t}$ and $\frac{1}{q} = \frac{m-r}{m-t}$, so that $\frac{1}{p} + \frac{1}{q} = 1$. Also let $A_i = a_i^{1/p} b_i^{m/p}$ and $B_i = a_i^{1/q} b_i^{t/q}$
so that $A_i B_i = a_i b_i^{r}$.

Now, with $p > 0$ and $q > 0$, (6.33) can be rewritten as

\[ \sum_{i=1}^{n} A_i B_i \le \left( \sum_{i=1}^{n} A_i^{p} \right)^{\frac{1}{p}} \left( \sum_{i=1}^{n} B_i^{q} \right)^{\frac{1}{q}}, \qquad \frac{1}{p} + \frac{1}{q} = 1. \tag{6.34} \]

The integral form of (6.28) was also given by Rogers. Note that the integral form of
(6.34) would be: Supposing $f$ and $g$ to be positive, integrable functions in $[a, b]$, then

\[ \int_a^b f(x) g(x)\, dx \le \left( \int_a^b f^{p}(x)\, dx \right)^{\frac{1}{p}} \left( \int_a^b g^{q}(x)\, dx \right)^{\frac{1}{q}}. \tag{6.35} \]

Inequalities (6.34) and (6.35) are now called Hölder’s inequalities. Observe that the
integral form (6.35) can be obtained from (6.34) by writing the integrals as the limits
of sums.
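The change of variables from Rogers's form to the standard form can be traced in a few lines. This is a sketch under the assumption that $A_i = a_i^{1/p} b_i^{m/p}$ and $B_i = a_i^{1/q} b_i^{t/q}$ with $1/p = (r-t)/(m-t)$ and $1/q = (m-r)/(m-t)$; the function names are hypothetical.

```python
import math

def rogers_to_holder(a, b, m, r, t):
    """Translate the data of Rogers's inequality into (A, B, p, q) for Hoelder's."""
    p = (m - t) / (r - t)
    q = (m - t) / (m - r)
    A = [ai ** (1 / p) * bi ** (m / p) for ai, bi in zip(a, b)]
    B = [ai ** (1 / q) * bi ** (t / q) for ai, bi in zip(a, b)]
    return A, B, p, q

def holder_sides(A, B, p, q):
    """Return (sum A_i B_i, (sum A_i^p)^(1/p) * (sum B_i^q)^(1/q))."""
    lhs = sum(x * y for x, y in zip(A, B))
    rhs = sum(x ** p for x in A) ** (1 / p) * sum(y ** q for y in B) ** (1 / q)
    return lhs, rhs

a = [1.0, 2.0, 3.0]
b = [0.5, 1.5, 2.5]
m, r, t = 4.0, 3.0, 1.0

A, B, p, q = rogers_to_holder(a, b, m, r, t)
lhs, rhs = holder_sides(A, B, p, q)
print(p, q, lhs, rhs)
```

With these sample values $p = 3/2$ and $q = 3$, and the left side equals $\sum a_i b_i^r$, so the two forms of the inequality are seen to be the same statement.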

Rogers applied (6.35) to derive inequalities for the gamma and beta integrals; note
these integrals are defined by

\[ \Gamma(s) = \int_0^{\infty} x^{s-1} e^{-x}\, dx \]

and

\[ B(s,t) = \int_0^1 x^{s-1} (1-x)^{t-1}\, dx. \]

The inequalities for these integrals were found by Rogers as:

\[ \Gamma(r)^{m-t} \le \Gamma(m)^{r-t}\, \Gamma(t)^{m-r}, \qquad m > r > t, \tag{6.36} \]

and

\[ B(l,r)^{m-t} \le B(l,m)^{r-t}\, B(l,t)^{m-r}, \qquad m > r > t; \tag{6.37} \]

note that they follow immediately from the integral form of (6.32), equivalent to
(6.35).


6.6 Hölder


In his 1889 paper,²⁸ “Ueber einen Mittelwerthssatz,” Otto Hölder wrote that he had
a general theorem from which Rogers’s inequalities could be derived and that this
theorem had connections to the principles of differential calculus. In order to state
Hölder’s theorem, we define the concept of a convex function: A function $f(x)$ on
$[a,b]$ is called convex if the line segment joining any pair of points on the curve $y = f(x)$ lies
above the curve as shown in Figure 6.2.

Let $t$ be any point in $[a,b]$. The curve $PSQ$ is given by $y = f(x)$ and the
coordinates of $P$, $S$, and $Q$ are respectively $(a, f(a))$, $(t, f(t))$, and $(b, f(b))$. Note
that $t$ can be written uniquely as $t = (1 - \alpha)a + \alpha b$, where $0 \le \alpha \le 1$. By means
of similar triangles, one can show that the coordinates of the point $R$, which lies on the
chord $PQ$ directly above $S$, must be $(t, (1 - \alpha) f(a) + \alpha f(b))$. The fact that $R$ lies
above the point $S$ can thus be expressed by the inequality

\[ f((1 - \alpha)a + \alpha b) \le (1 - \alpha) f(a) + \alpha f(b). \]


Figure 6.2 A convex function.


28 Hölder (1889).

Thus, a function $f$ is convex on $[a,b]$ if, for every pair of points $x_1, x_2 \in [a,b]$,
we have

\[ f((1 - \alpha)x_1 + \alpha x_2) \le (1 - \alpha) f(x_1) + \alpha f(x_2), \qquad 0 \le \alpha \le 1. \tag{6.38} \]


We may therefore take the inequality (6.38) as the definition of a convex function, a 
definition equivalent to Jensen’s, given in equation (6.8) with accompanying condition. 

Next, suppose that $x_1 < x_2 < x_3 < \cdots < x_n$ are $n$ points in the interval $[a, b]$
where $f(x)$ is convex. Let $\alpha_1, \alpha_2, \ldots, \alpha_n$ represent $n$ positive numbers whose sum is
one, that is, $\alpha_1 + \alpha_2 + \cdots + \alpha_n = 1$. In that case,

\[ f(\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n) \le \alpha_1 f(x_1) + \alpha_2 f(x_2) + \cdots + \alpha_n f(x_n). \tag{6.39} \]

Hölder proved the lemma (6.39) by induction; clearly, the result is true for $n = 2$.
Now assume the result true for $n - 1$. That means that, given $n - 1$ positive numbers
such that $\beta_1 + \beta_2 + \cdots + \beta_{n-1} = 1$ and given $y_1 < y_2 < \cdots < y_{n-1}$ points in $[a, b]$,
one has

\[ f(\beta_1 y_1 + \beta_2 y_2 + \cdots + \beta_{n-1} y_{n-1}) \le \beta_1 f(y_1) + \cdots + \beta_{n-1} f(y_{n-1}). \tag{6.40} \]

Let

\[ s_{n-1} = \frac{\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_{n-1} x_{n-1}}{\alpha_1 + \alpha_2 + \cdots + \alpha_{n-1}}; \]

the inductive hypothesis then implies that

\[ f(s_{n-1}) \le \frac{\alpha_1 f(x_1) + \alpha_2 f(x_2) + \cdots + \alpha_{n-1} f(x_{n-1})}{\alpha_1 + \alpha_2 + \cdots + \alpha_{n-1}}. \]

Now we can see that

\[ s_n = \alpha_1 x_1 + \cdots + \alpha_{n-1} x_{n-1} + \alpha_n x_n = (\alpha_1 + \cdots + \alpha_{n-1})\, s_{n-1} + \alpha_n x_n \]

so that

\[ f(s_n) \le (\alpha_1 + \cdots + \alpha_{n-1}) f(s_{n-1}) + \alpha_n f(x_n) \le \alpha_1 f(x_1) + \cdots + \alpha_n f(x_n). \]

We can now state Hölder’s basic theorem: Suppose that $f(x)$ is twice differentiable
in $(a,b)$. If $f'(x)$ is increasing in $(a,b)$ [or if $f''(x) > 0$ in $(a,b)$], then $f$ is
convex in $(a,b)$.
To prove this, Hölder set $x_1, x_2 \in (a,b)$ and let $x_1 < s < x_2$ so that there must exist
positive numbers $a_1, a_2$ such that

\[ s = \frac{a_1 x_1 + a_2 x_2}{a_1 + a_2}. \]

He applied the mean value theorem to $f(x)$ on the interval $[x_1, s]$ to obtain

\[ f(s) - f(x_1) = f'(c_1)(s - x_1) = f'(c_1)\, \frac{a_2 (x_2 - x_1)}{a_1 + a_2}, \qquad x_1 < c_1 < s, \tag{6.41} \]

and next applied the mean value theorem to $f(x)$ on $[s, x_2]$ to arrive at

\[ f(x_2) - f(s) = f'(c_2)(x_2 - s) = f'(c_2)\, \frac{a_1 (x_2 - x_1)}{a_1 + a_2}, \qquad s < c_2 < x_2. \tag{6.42} \]

Hölder multiplied equation (6.41) by $a_1$ and (6.42) by $a_2$ and then subtracted the
resulting first equation from the second to get

\[ a_1 f(x_1) + a_2 f(x_2) - (a_1 + a_2) f(s) = \left( f'(c_2) - f'(c_1) \right) \frac{a_1 a_2 (x_2 - x_1)}{a_1 + a_2}. \tag{6.43} \]

Since $c_2 > c_1$ and since $f'$ was increasing, the right-hand side of (6.43) would be
positive and Hölder could conclude that

\[ f(s) = f\left( \frac{a_1 x_1 + a_2 x_2}{a_1 + a_2} \right) < \frac{a_1}{a_1 + a_2} f(x_1) + \frac{a_2}{a_1 + a_2} f(x_2), \tag{6.44} \]

completing the proof that $f$ was convex.
From (6.44), an inductive proof could be given that if $f$ were convex in an interval
and $x_1, x_2, \ldots, x_n$ were points in that interval, with $a_1, a_2, \ldots, a_n$ positive numbers,
then

\[ f\left( \frac{a_1 x_1 + a_2 x_2 + \cdots + a_n x_n}{a_1 + a_2 + \cdots + a_n} \right) \le \sum_{i=1}^{n} \frac{a_i}{a_1 + \cdots + a_n}\, f(x_i). \tag{6.45} \]

Hölder observed that $f(x) = e^x$ was a convex function, because $f''(x) = e^x > 0$.
Hence (6.45) implied that

\[ e^{\frac{a_1 x_1 + a_2 x_2 + \cdots + a_n x_n}{a_1 + a_2 + \cdots + a_n}} \le \frac{a_1 e^{x_1} + a_2 e^{x_2} + \cdots + a_n e^{x_n}}{a_1 + a_2 + \cdots + a_n}. \tag{6.46} \]

Now note that, given positive numbers $b_i$, $i = 1, 2, \ldots, n$, there exist numbers $x_i$
such that $e^{x_i} = b_i$. When Hölder substituted these values in (6.46), the result was
Rogers’s inequality (6.27).
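To derive Rogers’s second inequality (6.28), Hölder applied Taylor’s theorem with
remainder:

\[ f(x_\nu) = f(\sigma) + (x_\nu - \sigma) f'(\sigma) + \frac{1}{2} (x_\nu - \sigma)^2 f''(\sigma_\nu), \tag{6.47} \]

where $\sigma_\nu$ lies between $\sigma$ and $x_\nu$. He multiplied (6.47) by $a_\nu$ and added for all $\nu =
1, 2, \ldots, n$ to obtain

\[ \sum_{\nu=1}^{n} a_\nu f(x_\nu) = f(\sigma) \sum_{\nu=1}^{n} a_\nu + f'(\sigma) \sum_{\nu=1}^{n} a_\nu (x_\nu - \sigma) + \frac{1}{2} \sum_{\nu=1}^{n} a_\nu (x_\nu - \sigma)^2 f''(\sigma_\nu). \tag{6.48} \]

He assumed $f$ to be convex so that $f''(\sigma_\nu) \ge 0$ and thus

\[ \sum_{\nu=1}^{n} a_\nu (x_\nu - \sigma)^2 f''(\sigma_\nu) \ge 0. \]

He then took

\[ \sigma = \frac{\sum_{\nu=1}^{n} a_\nu x_\nu}{\sum_{\nu=1}^{n} a_\nu} \]

so that the term $\sum_{\nu=1}^{n} a_\nu (x_\nu - \sigma)$ in (6.48) became zero and (6.48) produced the
inequality

\[ \sum_{\nu=1}^{n} a_\nu f(x_\nu) \ge f(\sigma) \sum_{\nu=1}^{n} a_\nu. \tag{6.49} \]

To obtain (6.28), Hölder took $f(x) = x^m$, $m > 1$, so that $f''(x) = m(m - 1)x^{m-2} > 0$
for $x > 0$, showing that (6.49) was true for $f(x) = x^m$, $m > 1$. Thus,
(6.49) implied that

\[ \left( \sum_{\nu=1}^{n} a_\nu \right)^{m-1} \sum_{\nu=1}^{n} a_\nu x_\nu^{m} \ge \left( \sum_{\nu=1}^{n} a_\nu x_\nu \right)^{m}. \tag{6.50} \]

It is not hard to show, using a suitable change in variables, that (6.50) is equivalent
to (6.28) and thus to what is now known as Hölder’s inequality: For $p, q > 0$ and
$\frac{1}{p} + \frac{1}{q} = 1$,

\[ \sum_{\nu=1}^{n} a_\nu b_\nu \le \left( \sum_{\nu=1}^{n} a_\nu^{p} \right)^{\frac{1}{p}} \left( \sum_{\nu=1}^{n} b_\nu^{q} \right)^{\frac{1}{q}} \]

or

\[ \int_a^b f(x) g(x)\, dx \le \left( \int_a^b f^{p}(x)\, dx \right)^{\frac{1}{p}} \left( \int_a^b g^{q}(x)\, dx \right)^{\frac{1}{q}}, \]

where $f$ and $g$ are nonnegative in $(a,b)$.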
Let us now interpret Rogers’s inequality (6.36), noting that it can be written for
$0 < \alpha < 1$ as

\[ \log \Gamma(\alpha x + (1 - \alpha) y) \le \alpha \log \Gamma(x) + (1 - \alpha) \log \Gamma(y), \qquad x, y > 0. \]

This means that $\log \Gamma(x)$ is a convex function for $x > 0$. Thus, $\Gamma(x)$ would be
called logarithmically convex and, as we see in Section 17.14, this is one of the
defining properties of $\Gamma(x)$, the other properties being the obvious ones: $\Gamma(1) = 1$
and $\Gamma(x + 1) = x\Gamma(x)$.
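The equivalence between Rogers's gamma inequality and logarithmic convexity can be checked with the standard library's log-gamma function. The sketch below is illustrative; the two helper names are hypothetical, and the link between them uses $\alpha = (r - t)/(m - t)$, so that $r = \alpha m + (1 - \alpha) t$.

```python
import math

def log_convexity_gap(x, y, alpha):
    """alpha*log Gamma(x) + (1-alpha)*log Gamma(y) - log Gamma(alpha*x + (1-alpha)*y)."""
    lhs = math.lgamma(alpha * x + (1 - alpha) * y)
    rhs = alpha * math.lgamma(x) + (1 - alpha) * math.lgamma(y)
    return rhs - lhs

def rogers_gamma_gap(m, r, t):
    """Logarithm of (right side / left side) of Rogers's gamma inequality, m > r > t > 0."""
    return (r - t) * math.lgamma(m) + (m - r) * math.lgamma(t) - (m - t) * math.lgamma(r)

m, r, t = 5.0, 3.0, 1.0
alpha = (r - t) / (m - t)
print(rogers_gamma_gap(m, r, t))
print((m - t) * log_convexity_gap(m, t, alpha))
```

The two printed numbers agree and are nonnegative, which is the statement that Rogers's inequality is the convexity of $\log \Gamma$ written in multiplicative form.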


6.7 Jensen’s Inequality 


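Jensen proved (6.9)²⁹ by following Cauchy’s proof of (6.2) in detail. From the
definition of convexity (6.8), he deduced that

\[ \phi(x_1) + \phi(x_2) + \phi(x_3) + \phi(x_4) \ge 2\phi\left( \frac{x_1 + x_2}{2} \right) + 2\phi\left( \frac{x_3 + x_4}{2} \right) \ge 4\phi\left( \frac{x_1 + x_2 + x_3 + x_4}{4} \right). \]

He showed by an inductive argument that

\[ \sum_{\nu=1}^{2^m} \phi(x_\nu) \ge 2^m \phi\left( 2^{-m} \sum_{\nu=1}^{2^m} x_\nu \right). \]

This proved the inequality for the case in which the number of $x$s was a power
of two. To prove the theorem for any number of $x$s, Jensen, still following Cauchy,
applied Cauchy’s ingenious idea: For any positive integer $n$, choose $m$ so that $2^m > n$
and set

\[ x_{n+1} = x_{n+2} = \cdots = x_{2^m} = \frac{x_1 + x_2 + \cdots + x_n}{n}. \]

Then

\[ \sum_{\nu=1}^{n} \phi(x_\nu) + (2^m - n)\, \phi\left( \frac{1}{n} \sum_{\nu=1}^{n} x_\nu \right) \ge 2^m \phi\left( \frac{1}{n} \sum_{\nu=1}^{n} x_\nu \right) \]

or

\[ \phi\left( \frac{1}{n} \sum_{\nu=1}^{n} x_\nu \right) \le \frac{1}{n} \sum_{\nu=1}^{n} \phi(x_\nu). \]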
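Cauchy's padding device lends itself to a short experiment. The sketch below pads a list with copies of its mean up to a power of two, then applies the midpoint inequality level by level; the function names and the sample function $\phi(x) = x^2$ are illustrative assumptions.

```python
def pad_to_power_of_two(xs):
    """Append copies of the mean until the length is a power of two exceeding len(xs)."""
    n = len(xs)
    mean = sum(xs) / n
    size = 1
    while size <= n:
        size *= 2
    return list(xs) + [mean] * (size - n)

def halving_chain(phi, xs):
    """Apply phi(x) + phi(y) >= 2*phi((x + y)/2) level by level.

    xs must have power-of-two length. The returned list starts with the sum of
    phi over xs and ends with len(xs) * phi(mean); it is non-increasing when
    phi is midpoint convex."""
    chain = [sum(phi(x) for x in xs)]
    weight = 1
    level = list(xs)
    while len(level) > 1:
        level = [(level[i] + level[i + 1]) / 2 for i in range(0, len(level), 2)]
        weight *= 2
        chain.append(weight * sum(phi(x) for x in level))
    return chain

xs = [1.0, 2.0, 4.0]
padded = pad_to_power_of_two(xs)
chain = halving_chain(lambda x: x * x, padded)
print(padded)
print(chain)
```

Padding leaves the mean unchanged, so the last entry of the chain is $2^m \phi$ of the original mean, and subtracting the $2^m - n$ padded terms from the first entry gives the inequality for $n$ terms.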


29 Jensen (1906). 
Jensen then used the continuity of $\phi$ to get the more general inequality (6.10). He
supposed $a_1, a_2, \ldots, a_m$ to be $m$ positive numbers with sum $a$, as in (6.10). He chose
sequences of positive integers $n_1, n_2, \ldots, n_m$ with $n_1 + n_2 + \cdots + n_m = n$ such that

\[ \lim_{n \to \infty} \frac{n_1}{n} = \frac{a_1}{a}, \quad \lim_{n \to \infty} \frac{n_2}{n} = \frac{a_2}{a}, \quad \ldots, \quad \lim_{n \to \infty} \frac{n_{m-1}}{n} = \frac{a_{m-1}}{a}. \]

Consequently, he could write

\[ \lim_{n \to \infty} \frac{n_m}{n} = \frac{a_m}{a}. \]

Now (6.9) implied that

\[ \phi\left( \frac{n_1 x_1 + n_2 x_2 + \cdots + n_m x_m}{n} \right) \le \frac{n_1}{n} \phi(x_1) + \frac{n_2}{n} \phi(x_2) + \cdots + \frac{n_m}{n} \phi(x_m); \]

from this Jensen got (6.10) by letting $n \to \infty$ and using the continuity of $\phi$. Jensen
also gave an integral analog of this inequality. He supposed that $a(x)$ and $f(x)$ were
integrable on $(0,1)$ and $a(x)$ was positive; $\phi(x)$ was assumed to be convex and
continuous in the interval $(g_0, g_1)$, where $g_0$ and $g_1$ were, respectively, the inferior
and superior limits of $f(x)$ in $(0,1)$. Then he had


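\[ \phi\left( \frac{\sum_{\nu=1}^{n} a\left(\frac{\nu}{n}\right) f\left(\frac{\nu}{n}\right) \frac{1}{n}}{\sum_{\nu=1}^{n} a\left(\frac{\nu}{n}\right) \frac{1}{n}} \right) \le \frac{\sum_{\nu=1}^{n} a\left(\frac{\nu}{n}\right) \phi\left( f\left(\frac{\nu}{n}\right) \right) \frac{1}{n}}{\sum_{\nu=1}^{n} a\left(\frac{\nu}{n}\right) \frac{1}{n}}. \]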


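By letting $n \to \infty$, he found

\[ \phi\left( \frac{\int_0^1 a(x) f(x)\, dx}{\int_0^1 a(x)\, dx} \right) \le \frac{\int_0^1 a(x)\, \phi(f(x))\, dx}{\int_0^1 a(x)\, dx}. \]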


6.8 Riesz’s Proof of Minkowski’s Inequality 


Riesz’s derivations³⁰ of Hölder’s and Minkowski’s inequalities were contained in
his letter to Leonida Tonelli of February 5, 1928. Although Riesz had worked out
these ideas almost two decades earlier and had presented them in papers, his object
in this letter was to present proofs of the inequalities without any mention of the
applications. These proofs are essentially the same as our standard derivations of all
these inequalities. Here we use the concept of a measurable set $E$, but $E$ can also
be replaced with an interval $(a,b)$. Stating that he did this work around 1910, Riesz
started with

\[ A^{\alpha} B^{1-\alpha} \le \alpha A + (1 - \alpha) B, \qquad 0 < \alpha < 1, \quad A \ge 0, \quad B \ge 0. \]


30 Riesz (1960). 
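This followed immediately from the convexity of the exponential function, but Riesz
gave a simpler proof. After this proof, he supposed $f(x)$ and $g(x)$ were nonnegative
functions defined on a measurable set $E$ such that

\[ \int_E f^{p}\, dx = \int_E g^{\frac{p}{p-1}}\, dx = 1, \qquad p > 1. \]

He then took $A = f^{p}$, $B = g^{\frac{p}{p-1}}$, $\alpha = \frac{1}{p}$ to get

\[ fg \le \frac{1}{p} f^{p} + \frac{p-1}{p}\, g^{\frac{p}{p-1}}; \]

thus

\[ \int_E fg\, dx \le \frac{1}{p} + \frac{p-1}{p} = 1. \]

For general $f$ and $g$, he replaced $f$ and $g$ by

\[ \frac{|f|}{\left( \int_E |f|^{p}\, dx \right)^{\frac{1}{p}}} \quad \text{and} \quad \frac{|g|}{\left( \int_E |g|^{\frac{p}{p-1}}\, dx \right)^{\frac{p-1}{p}}}, \]

respectively, to obtain

\[ \int_E |fg|\, dx \le \left( \int_E |f|^{p}\, dx \right)^{\frac{1}{p}} \left( \int_E |g|^{\frac{p}{p-1}}\, dx \right)^{\frac{p-1}{p}}. \]

He next cleverly observed that

\[ \int_E (f + g)^{p}\, dx = \int_E f (f + g)^{p-1}\, dx + \int_E g (f + g)^{p-1}\, dx. \]

With $f \ge 0$ and $g \ge 0$, he had

\[ \int_E (f + g)^{p}\, dx \le \left( \int_E f^{p}\, dx \right)^{\frac{1}{p}} \left( \int_E (f + g)^{p}\, dx \right)^{\frac{p-1}{p}} + \left( \int_E g^{p}\, dx \right)^{\frac{1}{p}} \left( \int_E (f + g)^{p}\, dx \right)^{\frac{p-1}{p}}. \]

Dividing across by $\left( \int_E (f + g)^{p}\, dx \right)^{\frac{p-1}{p}}$, he could obtain

\[ \left( \int_E (f + g)^{p}\, dx \right)^{\frac{1}{p}} \le \left( \int_E f^{p}\, dx \right)^{\frac{1}{p}} + \left( \int_E g^{p}\, dx \right)^{\frac{1}{p}} \]

and this was Minkowski’s inequality, stated and used by Minkowski for sums in
geometry of numbers.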
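For finite sums, the two ingredients of this argument can be tried out directly. The sketch below is an illustration for sequences rather than integrals over a measurable set; the function names are hypothetical.

```python
import math

def p_norm(xs, p):
    """(sum |x|^p)^(1/p) for a finite sequence."""
    return sum(abs(x) ** p for x in xs) ** (1 / p)

def young_gap(A, B, alpha):
    """alpha*A + (1-alpha)*B - A^alpha * B^(1-alpha); nonnegative for A, B >= 0."""
    return alpha * A + (1 - alpha) * B - A ** alpha * B ** (1 - alpha)

def minkowski_gap(f, g, p):
    """||f||_p + ||g||_p - ||f + g||_p for finite nonnegative sequences f and g."""
    total = [x + y for x, y in zip(f, g)]
    return p_norm(f, p) + p_norm(g, p) - p_norm(total, p)

f = [1.0, 2.0, 3.0]
g = [4.0, 0.0, 1.0]
p = 3.0

# The splitting used in the proof: sum (f+g)^p = sum f(f+g)^(p-1) + sum g(f+g)^(p-1).
whole = sum((x + y) ** p for x, y in zip(f, g))
split = sum(x * (x + y) ** (p - 1) for x, y in zip(f, g)) + sum(y * (x + y) ** (p - 1) for x, y in zip(f, g))
print(whole, split)
print(young_gap(2.0, 3.0, 0.25), minkowski_gap(f, g, p))
```

When one sequence is a positive multiple of the other, the Minkowski gap is zero up to rounding, which is the equality case.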
6.9 Exercises


(1) Let $s_n = u_0 + u_1 + \cdots + u_n$, where the terms $u_i$ are positive.

(a) Show that $\ln \frac{s_n}{s_{n-1}} < \frac{u_n}{s_{n-1}}$.

(b) Deduce that

\[ \frac{u_1}{s_0} + \frac{u_2}{s_1} + \cdots + \frac{u_n}{s_{n-1}} > \ln s_n - \ln s_0. \]

(c) Prove that if $\sum_{n=0}^{\infty} u_n$ is divergent, then $\sum_{n=1}^{\infty} \frac{u_n}{s_{n-1}^{\alpha}}$ is divergent for $\alpha \le 1$.

(d) Show that when $\alpha > 0$,

\[ s_{n-1}^{-\alpha} = (s_n - u_n)^{-\alpha} > s_n^{-\alpha} + \alpha s_n^{-\alpha-1} u_n, \quad \text{so that} \quad s_{n-1}^{-\alpha} - s_n^{-\alpha} > \frac{\alpha u_n}{s_n^{1+\alpha}}. \]

(e) Deduce that if $\sum_{n=0}^{\infty} u_n$ is divergent, then $\sum_{n=1}^{\infty} \frac{u_n}{s_n^{1+\alpha}}$ is convergent for
$\alpha > 0$. See Abel (1965) vol. 2, pp. 197-98.

(2) Suppose $p > 1$ and $a_i > 0$. Suppose that the series $L = \sum_{i=1}^{\infty} a_i x_i$ converges
for every system of positive numbers $x_i$ $(i = 1, 2, \ldots)$ such that $\sum_{i=1}^{\infty} x_i^{p} = 1$.
Use Abel’s result in Exercise 1 to prove that $\sum_{i=1}^{\infty} a_i^{\frac{p}{p-1}}$ is convergent and that

\[ L \le \left( \sum_{i=1}^{\infty} a_i^{\frac{p}{p-1}} \right)^{\frac{p-1}{p}}. \]

See Landau (1907c).

(3) Prove that if $h$ is measurable and $\int_a^b |f(x)h(x)|\, dx$ exists for all functions
$f \in L^{p}(a,b)$, then $h \in L^{\frac{p}{p-1}}(a,b)$. See Riesz (1960) vol. 1, pp. 449-451.
(4) Show that

\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{a_m b_n}{m + n} \le 2\pi \left( \sum_{m=1}^{\infty} a_m^2 \right)^{\frac{1}{2}} \left( \sum_{n=1}^{\infty} b_n^2 \right)^{\frac{1}{2}}. \]

Hilbert presented this result in his lectures on integral equations. It was first
published in 1908 in Hermann Weyl’s doctoral dissertation. I. Schur proved
that the constant $2\pi$ could be replaced by $\pi$. See Steele (2004).


(5) Where $p_1, p_2, \ldots, p_n$ are real, let

\[ f(x,y) = (x + \alpha_1 y)(x + \alpha_2 y) \cdots (x + \alpha_n y) = x^n + n p_1 x^{n-1} y + \binom{n}{2} p_2 x^{n-2} y^2 + \cdots + \binom{n}{n} p_n y^n. \]

(a) Derive the quadratic polynomial obtained by first taking the $r$th derivative
of $f(x,y)$ with respect to $y$ and then the $(n - r - 2)$th derivative with
respect to $x$ of $f_y^{(r)}(x,y)$.

(b) Use the quadratic polynomial to show that if all $\alpha_1, \alpha_2, \ldots, \alpha_n$ are real,
then $p_{r+1}^2 \ge p_r p_{r+2}$. This is in effect the argument George Campbell
gave to show that if $p_{r+1}^2 < p_r p_{r+2}$ for some $r$, then $f(x,1)$ has at
least two complex roots. In fact, he did not use the variable $y$; instead,
he applied the lemma he stated and proved:

Whatever be the number of impossible roots in the equation

\[ x^n - Bx^{n-1} + Cx^{n-2} - Dx^{n-3} + \cdots \pm dx^3 \mp cx^2 \pm bx \mp A = 0, \]

there are just as many in the equation

\[ Ax^n - bx^{n-1} + cx^{n-2} - dx^{n-3} + \cdots \pm Dx^3 \mp Cx^2 \pm Bx \mp 1 = 0. \]

For the roots of the last equation are the reciprocals of those of the first as is evident
from common algebra.

This lemma is also contained in Newton’s Arithmetica Universalis.
Newton explained that the equation for the reciprocals of the roots of $f(x)$
was given by $x^n f\left(\frac{1}{x}\right) = 0$.


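(6) Suppose that $\alpha_1, \alpha_2, \ldots, \alpha_n$ in Exercise 5 are positive. Show that

\[ p_0 p_2\, (p_1 p_3)^2 (p_2 p_4)^3 \cdots (p_{k-1} p_{k+1})^k < p_1^2\, p_2^4\, p_3^6 \cdots p_k^{2k}. \]

Deduce Maclaurin’s inequality (6.7) that $p_{k+1}^{\frac{1}{k+1}} < p_k^{\frac{1}{k}}$. See Hardy, Littlewood,
and Pólya (1967).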


(7) Fourier’s proof of Descartes’s rule of signs: Suppose that the coefficients of 
the given polynomial have the following signs: 


Multiply this polynomial by x — p where p is positive. The result is 


The ambiguous sign ± appears whenever there are two terms with different
signs to be added. Show that in general the ambiguous sign appears whenever 
+ follows + or — follows —. Next show that the number of sign variations 
is not diminished by choosing either of the ambiguous signs. Also prove that 
there is always one variation added at the end, whether or not the original 
polynomial ends with a variation, as in our example. Show by induction that 
these facts, taken together, demonstrate Descartes’s rule. Descartes indicated 
no proof for his rule. In 1728, J. A. von Segner gave a proof and in 1741 
the French Jesuit priest J. de Gua de Malves gave a similar proof, apparently 
independently. We remark that de Gua also wrote a short history of algebra 
in which he emphasized French contributions to algebra at the expense of the 
English, in order to counter Wallis’s 1685 history, emphasizing the opposite. 


Fourier presented the method described in this exercise in his lectures at the
École Polytechnique, soon after its inauguration in November 1794. In 1789,
Fourier communicated a paper on the theory of equations to the Académie
des Sciences in Paris but due to the outbreak of the French Revolution the
paper was lost. In the late 1790s, Fourier’s interests turned to problems of
heat conduction; it was not until around 1820 that he returned to the theory of
equations. His book on equations was published posthumously in 1831.

(8) Gauss’s proof of Descartes’s rule: With his extraordinary mathematical
insight, Gauss saw the essence of Fourier’s argument and presented it in a
general form. He supposed

\[ x^{n+1} + A_1 x^{n} + A_2 x^{n-1} + \cdots + A_{n+1} = (x - p)(x^{n} + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_n), \]

and that the sign changes occurred at $a_{k_1}, a_{k_2}, \ldots, a_{k_s}$. Show that $A_{k_i} =
a_{k_i} - p\, a_{k_i - 1}$ and that this in turn implies that the signs of $A_{k_i}$ and $a_{k_i}$ are
identical for $i = 1, 2, \ldots, s$. Deduce also that there is an odd number of sign
changes between $A_{k_{i-1}}$ and $A_{k_i}$. Conclude, by induction, that the number of
sign changes is an upper bound for the number of positive roots and that the
two differ by an even number. Gauss published this result in 1828 in the newly
founded Crelle’s Journal. Note that Gauss did not use subscripts; we use them
for convenience. See Gauss (1863-1927) vol. 3, pp. 67-70.

(9) Fourier’s extension of Descartes’s rule gives an upper bound on the number
of real roots of a polynomial $f(x)$ of degree $n$ in an interval $(a,b)$. Suppose
$r$ is the number of real roots in $(a,b)$, $m$ is the number of sign changes in the
sequence

\[ f(x),\ f'(x),\ f''(x),\ \ldots,\ f^{(n)}(x) \]

when $x = a$, and $k$ is the number of sign changes when $x = b$. Prove that then
$(m - k) - r = 2p$, where $p$ is a nonnegative integer. Descartes’s rule follows
when $a = 0$ and $b = \infty$. In his 1831 book, Fourier gave a very leisurely
account of this theorem with numerous examples.

(10) Ferdinand François Budan’s (1761-1840) extension of Descartes’s rule: With
the notation as in the previous exercise, suppose that m is the number of 
sign changes in coefficients of powers of x in f(x + a), and that k is the 
corresponding number in f(x + b). Then, r < m —k. Prove this theorem and 
also prove that it follows from Fourier’s theorem. Budan was born in Haiti and 
was a physician by training. In 1807, he wrote a pamphlet on his theorem; then 
in 1811 he presented a paper to the Paris Academy. Lagrange and Legendre 
recommended it be published, but the Academy’s journal was not printed 
until 1827, partly due to political problems. With the appearance of Fourier’s 
papers in 1818 and 1820, Budan felt compelled to republish his pamphlet 
with the paper as an appendix. In response, Fourier pointed out that he had 
lectured on this theorem in the 1790s, as some of his students were willing 
to testify. Some of Fourier’s lecture notes from this period have survived;
they contain a discussion of algebraic equations, in particular Descartes’s rule,
but they do not discuss Fourier’s more general theorem. See the monograph,
Budan (1822).

(11) Let $f_0(x) = f(x)$ and $f_1(x) = f'(x)$. Apply the Euclidean algorithm to $f_0$
and $f_1$ but take the negatives of the remainders. Thus,

\[ \begin{aligned} f_0(x) &= q_1(x) f_1(x) - f_2(x), \\ f_1(x) &= q_2(x) f_2(x) - f_3(x), \\ &\;\;\vdots \\ f_{m-2}(x) &= q_{m-1}(x) f_{m-1}(x) - f_m(x). \end{aligned} \]

Consider the sequence $f_0(x), f_1(x), \ldots, f_m(x)$. Prove that the difference
between the number of changes of sign in the sequence when $x = a$ is
substituted and the number when $x = b$ is substituted gives the actual number
of real roots in the interval $(a,b)$. Charles Sturm (1803-1855) published this
theorem in 1829. Sturm was a great friend of Liouville; they jointly founded
the spectral theory of second order differential equations. He also worked as
an assistant to Fourier, who helped him in various ways. See Sturm (1829).

(12) Let $F(x) = Ax^{a} + \cdots + Mx^{r} + Nx^{s} + \cdots + Rx^{u}$, and let the powers of $x$
run in increasing (or decreasing) order. Let $m$ be the number of variations of
signs of the coefficients and let $\alpha$ be an arbitrary real number. Prove that the
number of positive roots of $xF'(x) - \alpha F(x) = 0$ is one less than the number
of positive roots of $F(x) = 0$. Prove also that if $\alpha$ lies between $r$ and $s$, then
the number of sign variations in the coefficients of $xF' - \alpha F$ is the same as
the number of sign variations in the sequence $A, \ldots, M, -N, \ldots, -R$; in
other words, $m - 1$. From this, deduce Descartes’s rule and prove that the
equation

\[ x^{3} - x^{2} + x^{\frac{1}{3}} + x^{\frac{1}{7}} - 1 = 0 \]

has at most three positive roots and no negative roots. These results were given
by Laguerre in 1883. See Laguerre (1972) vol. 1, pp. 1-3.

(13) Prove de Gua’s observation that when $2m$ successive terms of an equation
have 0 as coefficient, the equation has $2m$ complex roots; if $2m + 1$ successive
terms are 0, the equation has $2m + 2$, or $2m$ complex roots, depending on
whether the two terms, between which the missing terms occur, have like or
unlike signs. See Burnside and Panton (1960) vol. 1, chapter 10.

(14) In his book on the theory of equations, Robert Murphy took f(x) = x* −
6x* + 8x + 40 to illustrate Sturm’s theorem in Exercise 11. Carry out the
details. See Murphy (1839) p. 25.

(15) Suppose $f(x)$ is a polynomial of degree $n$. Prove Newton’s rule that if
$f(a), f'(a), \ldots, f^{(n)}(a)$ are all positive, then all the real roots of $f(x) = 0$
are less than $a$. Newton gave this rule in his Arithmetica Universalis in the
section “Of the Limits of Equations.”


(16) Following Fourier, let $f(x) = x^5 - 3x^4 - 24x^3 + 95x^2 - 46x - 101$. Consider
the sequence $f^{(5)}(x), f^{(4)}(x), \ldots, f'(x), f(x)$ and find the number of sign
variations when $x = -10$, $x = -1$, $x = 0$, $x = 1$, and $x = 10$. What does
your analysis show about the real roots of $f(x)$? Now apply Sturm’s method
to this polynomial. The tediousness of this computation explains why one
might wish to rely on Fourier’s procedure.

(17) Let

\[ f_0(x) = A_0 x^{m} + A_1 x^{m-1} + A_2 x^{m-2} + \cdots + A_{m-1} x + A_m. \]

Set $f_m(x) = A_0$, and $f_i(x) = x f_{i+1}(x) + A_{m-i}$, $i = m-1, m-2, \ldots, 0$. Prove
that the number of variations of sign in $f_m(a), f_{m-1}(a), \ldots, f_0(a)$, $a > 0$, is
an upper bound for the number of roots of $f_0(x)$ greater than $a$; show that the
two numbers differ by an even number. This result is due to Laguerre. See
Laguerre (1972) vol. 1, p. 73.

(18) After his examples of the incomplete rule, Newton moved on to state what
has become known as Newton’s complete rule for complex roots. In 1865, 
J. J. Sylvester offered a description of this rule: 


Let $fx = 0$ be an algebraical equation of degree $n$. Suppose

\[ fx = a_0 x^{n} + n a_1 x^{n-1} + n\, \frac{n-1}{2}\, a_2 x^{n-2} + \cdots + n a_{n-1} x + a_n; \]

$a_0, a_1, a_2, \ldots, a_n$ may be termed the simple elements of $fx$. Suppose

\[ A_0 = a_0^2, \quad A_1 = a_1^2 - a_0 a_2, \quad A_2 = a_2^2 - a_1 a_3, \quad \ldots, \quad A_{n-1} = a_{n-1}^2 - a_{n-2} a_n, \quad A_n = a_n^2; \]

$A_0, A_1, A_2, \ldots, A_n$ may be termed the quadratic elements of $fx$. $a_r, a_{r+1}$ is a
succession of simple elements, and $A_r, A_{r+1}$ of quadratic elements.

\[ \begin{matrix} a_r \\ A_r \end{matrix} \quad \text{is an associated couple of elements;} \]

\[ \begin{matrix} a_r & a_{r+1} \\ A_r & A_{r+1} \end{matrix} \quad \text{is an associated couple of successions.} \]


A succession may contain a permanence or a variation of signs, and will be termed 
for brevity a permanence or variation, as the case may be. Each succession in an 
associated couple may be respectively a permanence or a variation. Thus an associated 
couple may consist of two permanences or two variations, or a superior permanence 
and inferior variation, or an inferior permanence and superior variation; these may be 
denoted respectively by the symbols pP, vV, pV, vP, and termed double permanences,
double variations, permanence variations, variation permanences. The meaning of the 
simple symbols p,v, P, V speaks for itself. 


Newton’s rule in its complete form may be stated as follows: On writing the complete 
series of quadratic under the complete series of simple elements of fx in their natural 
order, the number of double permanences in the associated series, or pair of progressions 
so formed, is a superior limit to the number of negative roots, and the number of variation 
permanences in the same is a superior limit to the number of positive roots in fx. Thus
the number of negative roots = or < $\sum pP$ ..., positive roots = or < $\sum vP$. This
is the Complete Rule as given in other terms by Newton. The rule for negative roots
is deducible from that for positive, by changing x into −x. As a corollary, the total
number of real roots = or < $\sum pP + \sum vP$, that is = or < $\sum P$. Hence, the number of
imaginary roots

= or > $n - \sum P$, that is = or > $\sum V$.


This is Newton’s incomplete rule, or first part of complete rule, the rule as stated by 
every author whom the lecturer has consulted except Newton himself. 


Read Sylvester’s proof of this rule. Though Newton did not write down a 
proof, Sylvester writes in another paper of the same year, “On my mind the 
internal evidence is now forcible that Newton was in possession of a proof 
of this theorem (a point which he has left in doubt and which has often been 
called into question), and that, by singular good fortune, whilst I have been 
enabled to unriddle the secret which has baffled the efforts of mathematicians 
to discover during the last two centuries, I have struck into the very path which 
Newton himself followed to arrive at his conclusions.” See Sylvester (1973) 
vol. 2, pp. 494 and 498-513. See also Acosta (2003). 
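The sign-counting rules in Exercises 7 through 10 can be explored by computer. The sketch below counts sign variations in a coefficient list (Descartes) and in the sequence of derivatives at two points (Fourier's bound of Exercise 9). The helper names and the sample cubic $(x-1)(x-2)(x-3)$ are illustrative assumptions, not taken from the sources cited above.

```python
def sign_variations(values):
    """Count sign changes in a sequence, skipping zeros."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def derivative(coeffs):
    """Differentiate a polynomial given by coefficients in descending powers."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def evaluate(coeffs, x):
    """Horner evaluation of a polynomial with coefficients in descending powers."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

def fourier_sequence(coeffs, x):
    """Return [f(x), f'(x), ..., f^(n)(x)]."""
    values = []
    current = list(coeffs)
    while current:
        values.append(evaluate(current, x))
        current = derivative(current)
    return values

cubic = [1, -6, 11, -6]  # (x - 1)(x - 2)(x - 3)
descartes_bound = sign_variations(cubic)
m = sign_variations(fourier_sequence(cubic, 0))
k = sign_variations(fourier_sequence(cubic, 2.5))
print(descartes_bound, m, k, m - k)
```

For this cubic the coefficient signs give three variations, matching its three positive roots, and the difference $m - k$ for the interval $(0, 2.5)$ is 2, matching the two roots 1 and 2 in that interval.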


6.10 Notes on the Literature 


Newton’s Arithmetica Universalis, written in 1683, contains his account of the 
undergraduate algebra course he taught at Cambridge in the 1670s. This was partly 
based on Newton’s extensive notes on N. Mercator’s Latin translation of Gerard 
Kinckhuysen’s 1661 algebra text in Dutch. The later parts of the Arithmetica present 
Newton’s own researches in algebra, carried out in the 1660s. This work was first 
published in 1707, in Latin; Newton was reluctant to have it published, perhaps 
because the first portion depended much on Kinckhuysen. An English translation 
appeared in 1720, motivating Newton to make a few changes and corrections and 
publish a new Latin version in 1721. In 1722, the English translation was republished 
with the same minor changes. Whiteside published the 1722 version in vol. 2 of 
Newton (1964-67). Harriot and Stedall (2003) presents Harriot’s original text on 
algebra for the first time, although in English. The 1631 book published as Harriot’s 
algebra was in fact a mutilated and somewhat confused version. Stedall’s introduction 
explains this unfortunate occurrence. 

A good source for references to early work on inequalities is Hardy, Littlewood,
and Pólya (1967), though they omit Campbell. See Grattan-Guinness (1972) for an
interesting historical account of Fourier’s work on algebraic equations and Fourier 
series. Dieudonné (1981) is an excellent history of functional analysis and covers the 
period 1900-1950, from Hilbert and Riesz to Grothendieck. For functional analysis 
after 1950, see the comprehensive history of Pietsch (2007). 


7 


The Calculus of Newton and Leibniz 


7.1 Preliminary Remarks 


Newton was a student at Cambridge University from 1661 to 1665, but he does 
not appear to have undertaken a study of mathematics until 1663. According to de 
Moivre, Newton purchased an astrology book in the summer of 1663; in order to 
understand the trigonometry and diagrams in the book, he took up a study of Euclid. 
Soon after that, he read Oughtred’s Clavis and then Descartes’s Géométrie in van 
Schooten’s Latin translation. By the middle of 1664, Newton became interested in 
astronomy; he studied the work of Galileo and made notes and observations on 
planetary positions. This in turn required a deeper study of mathematics and Newton’s 
earliest mathematical notes date from the summer of 1664. On July 4, 1699, Newton 
wrote in his 1664-65 annotations on Wallis’s work that a little before Christmas 1664 
he bought van Schooten’s commentaries and a Latin translation of Descartes. He also 
wrote that he borrowed Wallis’s Arithmetica Infinitorum and other works. In fact, his 
meditations on van Schooten and Wallis during the winter of 1664-65 resulted in his 
discovery of his method of infinite series and of the calculus. 

Following the methods of van Schooten’s commentaries, Newton devoted intense 
study to problems related to the construction of the subnormal, subtangent, and 
the radius of curvature at a point on a given curve. Newton’s analyses of these 
problems gradually led him to discover a general differentiation procedure based 
on the concept of a small quantity, denoted by $o$, that ultimately vanished. Later in
life, Newton wrote that he received a hint of this method of Fermat from the second 
volume of van Schooten’s commentaries, although this gave only a brief summary 
based on P. Hérigone’s 1642 outline of Fermat’s method of finding the maximum or 
minimum of a function. Newton found the derivative, just as Fermat had, by expanding 
$f(x + o) = f(x) + o f'(x) + O(o^2)$. Newton realized that the derivative was a powerful
tool for the analyses of the subtangent, subnormal, and curvature and by the middle 
of 1665 he had worked out the standard algorithms for derivatives in general. Wallis’s 
work motivated Newton to research the integration of rational and algebraic functions. 
Newton combined this with a study of van Heuraet’s rectification of curves; in the 
summer of 1665, he began to understand the inverse method of tangents, that is, the
connection between derivatives and integrals. 

Newton left Cambridge in summer 1665 due to the plague, and returned to his home 
in Lincolnshire for two years. This gave him the opportunity to organize his thoughts 
on calculus and several other subjects. He gave up the idea of infinitesimal increments 
and adopted the concepts of fluents and fluxions as the new foundation for calculus. 
Fluents were flowing quantities; their finite instantaneous speeds were called fluxions, 
for which he later used the dot notation, such as $\dot{x}$, where $x$ was the fluent. From this
point of view, Newton regarded it as obvious that the fluxion of the area generated 
by the ordinate y along the x-axis would be y itself. In other words, the derivative of 
the area function was the ordinate. In the fall of 1665, Newton ran into trouble with 
an uncritical application of the parallelogram of forces method, but he soon realized 
his mistake and by the spring of 1666 he was able to apply the method to an analysis 
of inflection points. Note that in 1640, the French mathematician G. Roberval warned 
that a curve could be viewed as the result of a moving point, but that there were 
pitfalls to using the parallelogram of forces method to find the tangent. What was the 
origin of Newton’s conception of a curve as a moving point? A half century later, 
Newton wrote that, though his memory was unclear, he might have learned of a curve 
as a moving point from Barrow. Another possible source was Galileo but Newton did 
not mention him in this connection. In any case, Newton organized his concentrated 
research on calculus into a short thirty-page essay without title; he later referred to it 
as the October 1666 tract, published only in 1967 in the first volume of Whiteside’s 
edition of Newton’s mathematical papers. 

In 1671, Newton wrote up the results of his researches on calculus and infinite series 
as a textbook on methods of solving problems on tangents, curvature, inflection, areas, 
volumes, and arc length. The portions of this work on infinite series were expanded 
from his 1669 work De Analysi. Whiteside designated the 1671 book as De Methodis 
Serierum et Fluxionum because Newton once referred to it this way, but Newton’s 
original title is unknown because the first page of the original manuscript was lost. 
English translations of 1736 and 1737 were given the title The Method of Fluxions and
Infinite Series. Unfortunately, Newton was unable to publish this work in the 1670s, 
though he made several attempts. At that time, the market for advanced mathematics 
texts was not good; the publisher of Barrow’s lectures on geometry, for example, went 
bankrupt. The controversy with Leibniz, causing wasted time and effort, would have 
been avoided had Newton succeeded in publishing his work. 

Newton’s De Methodis dealt with fluxions analytically, but it was never actually 
completed; in some places he merely listed the topics for discussion. However, 
when he revised the text in the winter of 1671-72, Newton added a section on 
the geometry of fluxions, developed axiomatically; he later called this the synthetic 
method of fluxions. Note that in the Principia Newton employed his insightful 
geometric approach. The Nobel Prize winner and Principia expert S. Chandrasekhar 
commented on the mode of Newton’s proofs:¹


1 Wali (1991) pp. 242-243.

I first constructed the proofs for myself. Then I compared my proofs with those of Newton. The 
experience was a sobering one. Each time, I was left in sheer wonder at the elegance, the careful 
arrangement, the imperial style, the incredible originality, and above all the astonishing lightness 
of Newton’s proofs; and each time I felt like a schoolboy admonished by his master. 


As Newton was completing his researches on the calculus and infinite series, 
Gottfried Leibniz (1646-1716) was starting his mathematical studies. He studied law 
at the University of Leipzig but received his degree from the University of Altdorf, 
Nuremberg in 1666. At that time, he conceived the idea of reducing all reasoning to a 
symbolic computation, although he had not yet studied much mathematics. Leibniz’s 
mathematical education started with his meeting with Huygens in 1672, at whose 
suggestion he studied Pascal and then went on to read Grégoire St. Vincent’s Opus 
Geometricum and other mathematical works. 

From the beginning, Leibniz searched for a general formalism, or symbolic method, 
capable of handling infinitesimal problems in a unified way. In a paper of 1673, 
Leibniz began to denote geometric quantities associated with a curve, such as the 
tangent, normal, subtangent and subnormal, as functions. He began to set up tables of 
specific curves and their associated functions in order to determine the relations among 
these quantities. Thus, he raised the question of determining the curve, given some 
property of the tangent line. In 1673, Leibniz came to the conclusion that this problem, 
the inverse tangent problem, was reducible to the problem of quadratures. By the 
end of the 1670s, Leibniz had independently worked out his differential and integral 
calculus. In 1684, his first paper on differentiation appeared, and in 1686 his first paper 
on integration was published. The notation of Leibniz, including the differential and 
integral signs, gave insight into the processes and operations being performed. The 
Bernoulli brothers were among the first to learn and exploit the calculus of Leibniz 
and in the 1690s, they began to make contributions to the development of calculus in 
tandem with Leibniz. 

In the May 1690 issue of the Acta Eruditorum, Jakob Bernoulli proposed the 
problem of finding the curve assumed by a chain/string hung freely from two fixed 
points, named a catenary by Leibniz. Leibniz was the first to solve the problem, 
announcing his construction without details in the July 1691 issue of the Acta. Johann 
Bernoulli (1667-1748) soon published a solution, in which he explained that he and 
his brother had been surprised that this everyday problem had not attracted anyone’s 
attention. But in his paper, Leibniz wrote that the problem had been well known 
since Galileo had articulated it; moreover, Leibniz stated that he would refrain from 
publishing his solution by means of differential calculus, to give others a chance to 
work out a solution. Jakob had trouble with this question, since he initially thought 
the curve was a parabola, until Johann corrected him. According to a 1717 letter 
from Johann Bernoulli to Montmort, Jakob had initially assumed that the catenary 
was a parabola, as had Galileo. However, when Johann showed his brother the correct 
solution, Jakob was able to extend his brother’s method, developing a general theory 
of flexible strings.²


2 Spiess (1955) vol. 1, pp. 97-98. 
In his 1638 work, which we translate as Mathematical Discourses Concerning 
Two New Sciences, Galileo suggested that the catenary was a parabola. In 1646, 
Christian Huygens (1629-1695) showed that it could not be a parabola. In the 
1690s, Huygens offered a geometric solution to the problem posed by Bernoulli, 
using classical methods of which he was a master. In their approach, Leibniz and 
Johann Bernoulli applied mechanical principles to determine the differential equation 
of the catenary, making use of the work of Pardies. The Jesuit priest Ignace-Gaston 
Pardies (1636-1673) published a 1673 work on theoretical mechanics, developing his 
original idea of tension along the string, a concept fully clarified by Jakob Bernoulli. 


Thus, Leibniz and Johann found the differential equation of the catenary:

$$\frac{dy}{dx} = \frac{s}{a},$$

where s was the length of the curve.⁴ They showed that the solution of this differential 


equation was the integral 


In his 1691 paper, Leibniz presented a geometric figure and explained that the 
points on the catenary could be found from an exponential curve, called by Leibniz 
the logarithmic line. Details of this proof can be found in his letters to Huygens⁵ 
and von Bodenhausen.⁶ In modern notation, the solution would be expressed as 
$y = \frac{a}{2}\left(e^{x/a} + e^{-x/a}\right)$. Johann Bernoulli also failed to publish details but presented two 
geometric constructions of the catenary, one using the area under a curve related to 
a hyperbola and the other using the length of an arc of a parabola. In the 1690s, this 
kind of solution would have been acceptable, because the coordinates of any point on 
the catenary were then described in terms of geometric quantities related to known 
curves such as the hyperbola and parabola. In modern terms, the area and length can 
be written as the integrals 


{/—— and [\7Sa 
¥ (x — a)? — a? x , 


In the 1690s, Leibniz and Johann Bernoulli were arriving at an understanding of an 
exponential function. In a letter of May 1694,⁷ Bernoulli wrote to Leibniz that he had 
written a paper for the Acta Eruditorum in which he defined the exponential and the 
meaning and construction of $x^x$. He also mentioned that the area under $x^x$ over the 
interval (0, 1) was given by the series 


$$1 - \frac{1}{2^2} + \frac{1}{3^3} - \frac{1}{4^4} + \cdots. \qquad (7.1)$$


3 For an analysis of Galileo’s work, see Truesdell (1960) pp. 43-47. 

4 For a summary of Huygens’s paper and that of Leibniz and Bernoulli, see Truesdell (1960) pp. 64-75. 
5 For this letter and references to this and other letters to Huygens, see Truesdell (1960) p. 71. 

6 Leibniz (1971) vol. 7, pp. 370-372. 

7 Bernoulli and Leibniz (1745) vol. 1, pp. 5-9. 
In his reply of June 1694,⁸ Leibniz wrote that he had written to Huygens about 
these matters and he went on to explain that $y = x^x$ meant that $\ln y = x \ln x$ or 


$$\int dy : y = x \int (dx : x). \qquad (7.2)$$


Leibniz then gave the differential form of (7.2) as 

$$dy : y = dx + dx \int (dx : x)$$

or 

$$dy : dx = y\left(1 + \int (dx : x)\right).$$


Bernoulli did not offer a proof of (7.1) in his paper, but he included a proof in his 
collected papers.⁹ See Exercise 10 at the end of this chapter. 
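
The statements in this exchange can be checked numerically. The following Python sketch assumes the modern reading of Bernoulli's series as $1 - 1/2^2 + 1/3^3 - 1/4^4 + \cdots$ and of Leibniz's differential relation as $dy/dx = x^x(1 + \ln x)$; the function names are ours and are chosen only for illustration.

```python
import math


def integral_x_pow_x(n=200_000):
    """Midpoint-rule estimate of the area under x**x over the interval (0, 1)."""
    h = 1.0 / n
    return h * sum(((k + 0.5) * h) ** ((k + 0.5) * h) for k in range(n))


def bernoulli_series(terms=12):
    """Partial sum of 1 - 1/2**2 + 1/3**3 - 1/4**4 + ..."""
    return sum((-1) ** (k + 1) / k ** k for k in range(1, terms + 1))


def leibniz_derivative(x):
    """dy : dx = y (1 + ln x) for y = x**x, written in modern form."""
    return x ** x * (1.0 + math.log(x))


def central_difference(x, h=1e-6):
    """Numerical derivative of x**x, used only as an independent check."""
    return ((x + h) ** (x + h) - (x - h) ** (x - h)) / (2.0 * h)


area = integral_x_pow_x()
series = bernoulli_series()
```

The two numbers `area` and `series` should agree to many decimal places, and `leibniz_derivative(x)` should agree with `central_difference(x)` for any positive x.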


7.2 Newton’s 1671 Calculus Text 


The De Methodis Serierum et Fluxionum¹⁰ of Newton began by considering the 
general problem, called Problem 1, of determining the relation of the fluxions, given 
the relations to one another of two flowing quantities. As an example, Newton took 


$$x^3 - ax^2 + axy - y^3 = 0. \qquad (7.3)$$


His rule for finding the fluxional equation was to first write the equation in 
decreasing powers of x, as in (7.3), and then multiply the terms by $\frac{3\dot{x}}{x}$, $\frac{2\dot{x}}{x}$, $\frac{\dot{x}}{x}$, and 
0, respectively, to get 


$$3\dot{x}x^2 - 2\dot{x}ax + \dot{x}ay. \qquad (7.4)$$


Thus, if the term were $x^m y^n$, then it would be multiplied by $\frac{m\dot{x}}{x}$. He next wrote the 
equation in powers of y: $-y^3 + axy + (x^3 - ax^2)$ and multiplied the terms by $\frac{3\dot{y}}{y}$, $\frac{\dot{y}}{y}$, 
and 0 to obtain 

$$-3\dot{y}y^2 + a\dot{y}x. \qquad (7.5)$$


In order to obtain the equation expressing the relation between the fluxions $\dot{x}$ and 
$\dot{y}$, Newton added (7.4) and (7.5) and set the sum equal to zero: 


$$3\dot{x}x^2 - 2a\dot{x}x + a\dot{x}y - 3\dot{y}y^2 + a\dot{y}x = 0.$$


8 ibid. pp. 10-13. 
9 Bernoulli, Joh. (1968) vol. 3, pp. 380-381. 
10 Newton (1967-1981) vol. 3, pp. 32-353, especially pp. 74-83. 
From this it followed that 

$$\dot{x} : \dot{y} = (3y^2 - ax) : (3x^2 - 2ax + ay).$$


Newton also presented more examples, involving more complex expressions such as 


$$\sqrt{a^2 - x^2} \quad\text{and}\quad \sqrt{ay + x^2}.$$


Explaining why his rule for finding fluxional (differential) equations worked, 
Newton pointed out that a fluent quantity x with speed $\dot{x}$ would change by $\dot{x}o$ during 
the small interval of time o. So the fluent quantity x would become $x + \dot{x}o$ at the end 
of that time interval. Hence, the quantities $x + \dot{x}o$ and $y + \dot{y}o$ would satisfy the same 
relation as x and y, and when substituted in (7.3) gave him 


$$(x^3 + 3\dot{x}ox^2 + 3\dot{x}^2o^2x + \dot{x}^3o^3) - (ax^2 + 2a\dot{x}ox + a\dot{x}^2o^2)$$

$$+ (axy + a\dot{x}oy + a\dot{y}ox + a\dot{x}\dot{y}o^2) - (y^3 + 3\dot{y}oy^2 + 3\dot{y}^2o^2y + \dot{y}^3o^3) = 0. \qquad (7.6)$$


After subtracting (7.3) from (7.6) and dividing by o, Newton had 


$$3\dot{x}x^2 + 3\dot{x}^2ox + \dot{x}^3o^2 - 2a\dot{x}x - a\dot{x}^2o + a\dot{x}y + a\dot{y}x$$

$$+ a\dot{x}\dot{y}o - 3\dot{y}y^2 - 3\dot{y}^2oy - \dot{y}^3o^2 = 0.$$


Here Newton explained that quantities containing the factor o could be neglected,¹¹ 
“since o is supposed to be infinitely small so that it be able to express the moments of 
quantities, terms which have it as a factor will be equivalent to nothing in respect of 
the others. I therefore cast them out and there remains 


$3\dot{x}x^2 - 2a\dot{x}x + a\dot{x}y + a\dot{y}x - 3\dot{y}y^2 = 0.$” 


Note that this amounts to the result of implicit differentiation with respect to a 
parameter. Actually, Newton here used the letters m and n for $\dot{x}$ and $\dot{y}$, respectively. 
He introduced the dot notation in the early 1690s. From this he had the slope 


$$\dot{y} : \dot{x} = (3x^2 - 2ax + ay) : (3y^2 - ax). \qquad (7.7)$$
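
Newton's rule and his justification by the small interval o can be compared numerically. The sketch below is ours, not Newton's: it stores a polynomial relation as a dictionary from exponent pairs (m, n) to coefficients, applies the multipliers $m\dot{x}/x$ and $n\dot{y}/y$ term by term, and compares the result with the quotient obtained by substituting $x + \dot{x}o$ and $y + \dot{y}o$ and dividing by a small o. The value a = 2 and the sample values of x, y and their fluxions are arbitrary assumptions; the comparison does not require the point to lie on the curve.

```python
def newton_rule(poly):
    """Apply Newton's multipliers to a polynomial relation between x and y.

    poly maps exponent pairs (m, n) to the coefficient of x**m * y**n.
    Returns the polynomials that multiply the fluxion of x and the fluxion of y.
    """
    x_part, y_part = {}, {}
    for (m, n), coeff in poly.items():
        if m:  # term multiplied by m * xdot / x
            x_part[(m - 1, n)] = x_part.get((m - 1, n), 0.0) + coeff * m
        if n:  # term multiplied by n * ydot / y
            y_part[(m, n - 1)] = y_part.get((m, n - 1), 0.0) + coeff * n
    return x_part, y_part


def evaluate(poly, x, y):
    """Value of the polynomial at the point (x, y)."""
    return sum(coeff * x ** m * y ** n for (m, n), coeff in poly.items())


def moment_quotient(poly, x, y, xdot, ydot, o):
    """(F(x + xdot*o, y + ydot*o) - F(x, y)) / o, before terms in o are cast out."""
    return (evaluate(poly, x + xdot * o, y + ydot * o) - evaluate(poly, x, y)) / o


a = 2.0
curve = {(3, 0): 1.0, (2, 0): -a, (1, 1): a, (0, 3): -1.0}  # equation (7.3)
x_part, y_part = newton_rule(curve)

x, y, xdot, ydot = 1.5, 0.7, 0.3, -1.1  # arbitrary sample values
by_rule = evaluate(x_part, x, y) * xdot + evaluate(y_part, x, y) * ydot
by_moments = moment_quotient(curve, x, y, xdot, ydot, 1e-7)
```

Here `x_part` holds the coefficients of $3x^2 - 2ax + ay$ and `y_part` those of $ax - 3y^2$, so that `by_rule` is the left side of the fluxional equation; `by_moments` differs from it only by terms that carry a factor o.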


Observe that to construct the tangent, rather than work with slope, it is better to 
find the point where the tangent intersects the x-axis and join this point to the point 
of tangency on the curve. Now if (x, y) is the point on the curve, then the length of 
the segment of the x-axis from (x,0) to the intersection with the tangent is given by 


the magnitude of $\frac{y\dot{x}}{\dot{y}}$. In his discussion of the tangent, Newton computed this quantity 
to obtain 

$$\frac{3y^3 - axy}{3x^2 - 2ax + ay}.$$


11 ibid. p. 81. 


In the section of his book on maxima and minima, Newton gave a method and then 
two examples and nine exercises to be solved using that method. He never completed 
this section of his book. To find a maximum or minimum, he explained, the derivative 
should be set equal to zero at an extreme value:¹²


When a quantity is greatest or least, at that moment its flow neither increases nor decreases; for if 
it increases, that proves that it was less and will at once be greater than it now is, and conversely 
so if it decreases. Therefore seek its fluxion by Problem 1 [above] and set it equal to nothing. 


In the first application of this principle, he sought the greatest value of x in equation 
(7.3) by setting $\dot{x} = 0$ in the fluxional equation obtained from (7.4) and (7.5) to get 


$$-3y^2 + ax = 0; \qquad (7.8)$$


using this result in the original equation, one obtains the largest value of x. Newton 
remarked that equation (7.8) illustrated the “celebrated Rule of Hudde, that, to 
obtain the maxima or minima of the related quantity, the equation should lie ordered 
according to the dimensions of the correlate one and then multiplied by an arithmetical 
progression.” He added that his method extended to expressions with surd quantities, 
whereas the earlier rules and techniques did not. As an example, he gave the problem 
of finding the greatest value of y in the equation 


$$x^3 - ay^2 + \frac{by^3}{a + y} - x^2\sqrt{ay + x^2} = 0.$$


Newton wrote that the equation for the fluxions of x and y would come out to be 

$$3\dot{x}x^2 - 2a\dot{y}y + \frac{3ab\dot{y}y^2 + 2b\dot{y}y^3}{a^2 + 2ay + y^2} - \frac{4a\dot{x}xy + 6\dot{x}x^3 + a\dot{y}x^2}{2\sqrt{ay + x^2}} = 0.$$


He then observed that by hypothesis $\dot{y} = 0$ and hence, after substituting this in the 
equation and dividing by $\dot{x}x$, 


$$3x - \frac{2ay + 3x^2}{\sqrt{ay + x^2}} = 0 \quad\text{or}\quad 4ay + 3x^2 = 0.$$


Newton noted that this equation should be used to eliminate x or y from the original 
equation; the maximum would be obtained by solving the resulting cubic. 

The next section of Newton’s book discussed the problem of constructing tangents 
to curves and he mentioned seven problems solvable by the principles he explained. 
For example:¹³


12 ibid. p. 117. 
13 ibid. p. 149. 




(1) To find the point in a curve where the tangent is parallel to the base (or any 
other straight line given in position) or perpendicular to it or inclined to it at 
any given angle. 

(2) To find the point where a tangent is most or least inclined to the base or to 
another straight line given in position — to find, in other words, the bound 
of contrary flexure. I have already displayed an example of this above in the 
conchoid. 


By “the bound of contrary flexure” Newton meant the point of inflection and at this 
point, $\frac{d^2y}{dx^2} = 0$. In the example of the conchoid of Nicomedes, defined by 


$$yx = (b + y)\sqrt{c^2 - y^2},$$


Newton actually minimized the x-intercept of the tangent given by 

$$x - y\,\frac{dx}{dy}. \qquad (7.9)$$


Whiteside has noted¹⁴ that in 1653, Huygens determined the inflection points of 
this conchoid by this method, using Fermat’s procedure to obtain the minimum value. 
Newton was most likely aware of Huygens’s work and wanted to show that calculus 
algorithms could simplify Huygens’s calculation. It should be noted that Huygens’s 
criterion that the inflection points in general could be obtained by minimizing (7.9) is 
false, though it is true for the conchoid. 

Newton was very interested in problems related to curvature and intended to devote 
several chapters to the topic, but many of these are barely outlines. However, he 
presented a procedure for finding radius of curvature, and we explain this later on. 
He also included sections on arc length and the area of surface of revolution. 


7.3 Leibniz: Differential Calculus 


Leibniz gave a very terse account of his differential calculus in his 1684 Acta 
Eruditorum paper,¹⁵ starting with the basic rules for the differentials of geometric 
quantities (variables). Leibniz’s approach was not to find the derivative of a function. 
As he conceived things, geometric quantities had differentials; when the quantities 
stood in a certain relationship to one another, then the differentials also satisfied certain 
relations. To determine these relations, for a constant a and variable quantities v, x, y, 
etc., he stated the rules for the differentials dv,dx,dy, etc.: 


$$da = 0, \quad d(ax) = a\,dx, \quad d(z - y + w + x) = dz - dy + dw + dx,$$

$$d(xv) = x\,dv + v\,dx, \quad d\,\frac{v}{y} = \frac{\pm v\,dy \mp y\,dv}{yy}.$$


14 ibid. vol. 2, pp. 198-199, footnote 9. 
15 Leibniz (1684). For an English translation, see Struik (1969) pp. 272-280. 
Concerning signs of differentials, Leibniz explained that if the ordinate v increased, 
then dv was positive, and when v decreased, dv was negative. 

In only one paragraph, Leibniz described in terms of differentials: maxima and 
minima, concavity or convexity, and inflection points of curves. He explained that at a 
maximum or minimum for an ordinate v, dv = 0 since v was neither increasing nor 
decreasing. For concavity, the difference of the differences d dv had to be positive, 
and for convexity d dv had to be negative. Note here that Leibniz took the definitions 
of concavity and convexity to be the reverse of our present definition. At a point of 
inflection ddv = 0. After this, Leibniz gave the rules for the differentials of powers 
and roots, that is 

$$dx^a = ax^{a-1}\,dx, \quad\text{and}\quad d\sqrt[b]{x^a} = \frac{a}{b}\,dx\,\sqrt[b]{x^{a-b}}. \qquad (7.10)$$

He wrote that with this differential calculus, he could solve problems dealing with 
tangents and with maxima and minima by a uniform technique, lacking in the earlier 
expositions. To demonstrate the power of his method, he found the tangent to a curve 
defined by a complicated algebraic relation between the variables x and y. As another 
application of the differential calculus, he gave the derivation of Snell’s law in optics, 
one of the standard examples in modern textbooks. 

As a final example, Leibniz considered the problem that de Beaune proposed to 
Descartes in 1639. Florimond de Beaune (1601-1652) was a jurist who carefully 
studied Descartes’s book on geometry. He observed that, though Descartes had given 
a method for finding the tangent to a curve, he had not indicated how to obtain the 
curve, given a property of the tangent. One of de Beaune’s problems was to find the 
curve for which the subtangent was the same for each point of the curve. This problem 
translates to the differential equation $\frac{dy}{dx} = \frac{y}{a}$, where a is a constant. It is well known 
that the solution is $\ln y = \frac{x}{a} + c$ and Descartes came close to a solution.¹⁶ In the 
course of his work, he obtained, without mentioning logarithms, particular cases of 
the inequality, written in modern notation as 

$$\frac{1}{n} + \frac{1}{n+1} + \cdots + \frac{1}{m-1} > \ln\frac{m}{n} > \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{m}.$$
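
The inequality between the two harmonic sums and the logarithm of m/n is easy to test for particular values. The following Python sketch is our illustration, with a function name of our own choosing; it returns the lower sum, the logarithm, and the upper sum for integers 1 ≤ n < m.

```python
import math


def harmonic_bounds(n, m):
    """Return (lower sum, ln(m/n), upper sum) for integers 1 <= n < m."""
    upper = sum(1.0 / k for k in range(n, m))          # 1/n + ... + 1/(m-1)
    lower = sum(1.0 / k for k in range(n + 1, m + 1))  # 1/(n+1) + ... + 1/m
    return lower, math.log(m / n), upper


samples = [harmonic_bounds(n, m) for n, m in [(1, 2), (2, 5), (10, 20), (100, 1000)]]
```

In each sample the middle entry should lie strictly between the other two, and the two bounds differ by exactly 1/n - 1/m.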


To tackle this problem, Leibniz described the differential equation by saying that y 
was to a as dy was to dx. He then noted that dx could be chosen arbitrarily and hence 
could be taken to be a constant b. Then 

$$dy = \frac{b}{a}\,y, \quad\text{or}\quad y = \frac{a}{b}\,dy.$$

He observed that this implied that if the x formed an arithmetic progression, then the 
y formed a geometric progression. Leibniz did not explain or prove this statement, but 
it is easy to check that if 


16 Descartes (1897-1905) vol. 2, pp. 510-519. For a complete discussion of Descartes’s solutions, see 
Scriba (1961). See also Hofmann (1990) vol. 2, pp. 279-281. 
$y(x) = \frac{a}{b}\,dy(x)$, then 

$$y(x + dx) = y(x) + dy(x) = \left(1 + \frac{b}{a}\right) y(x).$$

Again, 

$$y(x + 2dx) = y(x + dx) + dy(x + dx) = y(x + dx) + \frac{b}{a}\,y(x + dx) = \left(1 + \frac{b}{a}\right) y(x + dx) = \left(1 + \frac{b}{a}\right)^2 y(x).$$

Similarly, 

$$y(x + 3dx) = y(x + 2dx) + \frac{b}{a}\,y(x + 2dx) = \left(1 + \frac{b}{a}\right)^3 y(x),$$

and in general 

$$y(x + n\,dx) = \left(1 + \frac{b}{a}\right)^n y(x).$$


This proves Leibniz’s claim, since he took dx to be a constant; thus, x, x + dx, x + 
2dx,... is an arithmetic progression, and the values of y at these points form a 
geometric progression. This idea illustrates the seventeenth-century understanding of 
logarithms. In fact, Leibniz was already suggesting that the logarithm be defined by 
means of the integral $\int \frac{dx}{x}$ and later on, he did so. 
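
Leibniz's observation can be imitated with a finite step. The Python sketch below is ours and rests on the assumption that dx is replaced by a small but finite constant b, so that each step multiplies y by 1 + b/a; the sample values of a, b, and the number of steps are arbitrary.

```python
import math


def leibniz_progression(y0, a, b, steps):
    """Values of y at x, x + b, x + 2b, ..., using dy = (b / a) * y at each step."""
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] + (b / a) * ys[-1])  # y(x + dx) = y(x) + dy(x)
    return ys


a, b, steps = 2.0, 1e-5, 100_000          # the x values run from 0 to b * steps = 1
ys = leibniz_progression(1.0, a, b, steps)
ratios = [ys[k + 1] / ys[k] for k in range(steps)]
```

Every entry of `ratios` equals 1 + b/a up to rounding, so the y values form a geometric progression, and for small b the last value is close to exp(1/a), in agreement with the solution ln y = x/a + c.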

Leibniz could have integrated to get $\ln y = \int \frac{dy}{y} = \frac{1}{a}\int dx = \frac{x}{a}$, but in his 1684 
paper, he did not use or discuss integration, though he had been aware of it for several 
years. Surprisingly, he gave a brief exposition of his ideas on integration in a review 
of John Craig’s 1685 book on quadrature.¹⁷ In the review,¹⁸ Leibniz introduced the 
symbol $\int$ for the summation of infinitesimal quantities and gave an illustration of its 
power when used in conjunction with differentials. Leibniz also pointed out that the 
integral symbol could be used to represent transcendental quantities such as the arcsine 
or logarithm, in such a way that it revealed a property of the quantity. 

In his 1684 paper on the differential calculus, Leibniz gave a derivation of Snell’s 
law by applying Fermat’s principle of least time. 


17 Craig (1685). 
18 Leibniz (1971) 3/IL, pp. 226-235. 
Figure 7.1 Leibniz’s figure for derivation of Snell’s law. 


In Figure 7.1, with the lines PC and QE perpendicular to SQ, light traveled from 
point C to point E and the line QP separated an upper medium of density r from a 
lower medium of density h. Leibniz explained that density should be understood to be 
with respect to the resistance to a ray of light. 

Let QF = x, QP = p, CP = c, and EQ = e. Then 


$$FC = \sqrt{cc + pp - 2px + xx} = \text{(in short) } \sqrt{l},$$

$$EF = \sqrt{ee + xx} = \text{(in short) } \sqrt{m}.$$


Leibniz gave the quantity to be minimized when the densities were taken into 
account as $w = h\sqrt{l} + r\sqrt{m}$. He then argued that to minimize, set dw = 0, to 
obtain 


$$0 = h\,dl : 2\sqrt{l} + r\,dm : 2\sqrt{m}.$$


Note that Leibniz specified that he would denote $\frac{x}{y}$ by x : y. He then observed that 
$dl = -2(p - x)\,dx$ and $dm = 2x\,dx$; hence he had Snell’s law: 


$$h(p - x) : \sqrt{l} = rx : \sqrt{m}.$$
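
Leibniz's condition can be confirmed by minimizing w directly. The Python sketch below is our own check: it assumes sample values for c, e, p, h, and r, locates the minimum of $w = h\sqrt{l} + r\sqrt{m}$ over 0 ≤ x ≤ p by ternary search (legitimate here because w is a convex function of x), and then evaluates the difference between the two sides of the final proportion.

```python
import math


def travel_cost(x, c, e, p, h, r):
    """w = h * sqrt(l) + r * sqrt(m), with l = cc + (p - x)^2 and m = ee + xx."""
    l = c * c + (p - x) ** 2
    m = e * e + x * x
    return h * math.sqrt(l) + r * math.sqrt(m)


def minimize_cost(c, e, p, h, r, iterations=200):
    """Ternary search for the x in [0, p] that minimizes the convex function w."""
    lo, hi = 0.0, p
    for _ in range(iterations):
        x1 = lo + (hi - lo) / 3.0
        x2 = hi - (hi - lo) / 3.0
        if travel_cost(x1, c, e, p, h, r) < travel_cost(x2, c, e, p, h, r):
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)


def snell_residual(x, c, e, p, h, r):
    """h (p - x) / sqrt(l) - r x / sqrt(m); zero when Leibniz's proportion holds."""
    l = c * c + (p - x) ** 2
    m = e * e + x * x
    return h * (p - x) / math.sqrt(l) - r * x / math.sqrt(m)


c, e, p, h, r = 1.0, 2.0, 3.0, 1.5, 1.0   # arbitrary sample values
x_min = minimize_cost(c, e, p, h, r)
residual = snell_residual(x_min, c, e, p, h, r)
```

At the minimizing x the residual vanishes to within the accuracy of the search, which is the proportion $h(p - x) : \sqrt{l} = rx : \sqrt{m}$.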


7.4 Leibniz on the Catenary 


Leibniz developed a theory of second- and higher-order differentials in order to apply 
differential calculus to geometry and mechanics. In his applications, including the 
catenary problem,¹⁹ he often took one of the variables, say y, to be such that the 
second-order differential ddy was 0 or, equivalently, that the first-order differential dy 
was a constant. This amounted to taking y to be the independent variable. To describe 
the curve of the catenary, Leibniz used Pardies’s important mechanical principle, 


19 Leibniz (1971) vol. 5, pp. 243-247. 
Figure 7.2 Pardies’s theorem. 


Figure 7.3 Leibniz’s figure of catenary, made for Huygens. 


dating from 1673, that for any portion AC of the curve made by the string, the vertical 
line through the center of gravity of AC, and the tangents at A and at C intersected at 
one point²⁰ (Figure 7.2). 

Leibniz’s letters to Huygens and Bodenhausen offered the following details of 
his derivation of the catenary.²¹ In Leibniz’s Figure 7.3, A is the lowest point of 
the catenary; CT is the tangent at a point C on the catenary; and Cβ, AB are 
perpendicular to Aβ, the tangent at A. We follow Leibniz’s notation and argument.²² 
Let AB = x, BC = y, Tβ = x dy : dx and AT = y − x dy : dx. Then, by Pardies’s 
theorem, the y coordinate of the center of gravity of the arc AC of length c is $\frac{1}{c}\int y\,dc$. 
Thus, 


$$\frac{1}{c}\int y\,dc = y - x\,dy : dx. \qquad (7.11)$$


Now multiply both sides by c and differentiate to get 


$$y\,dc = c\,dy + y\,dc - x\,dy : dx\;dc - c\,dy - cx\,d,\overline{dy : dx}. \qquad (7.12)$$


Note that Leibniz used a comma to separate the operator d from the quantity 
dy : dx, where the line above the expression was used instead of parentheses. Upon 
simplification, obtain in Leibniz’s notation 


$$dc\,dy : dx + c\,d,\overline{dy : dx} = 0. \qquad (7.13)$$


20 See Truesdell (1960) pp. 50-53. 

21 For Leibniz’s letters to Bodenhausen on the catenary, see Leibniz (1971) vol. 7, pp. 359-361 and 
pp. 370-372. 

22 See Truesdell (1960) pp. 71-72. 
Suppose that y increases uniformly, so that dy is constant and ddy = 0. This 
implies by the quotient rule that 


d, dy : dx = —dyddx : dx dx, 
so that (7.13) is transformed into dcdx —cddx = 0. 
By differentiating dx : c = dy : a we get the previous equation, indicating that this 
is the integral of that equation. Rewrite this integral as 
adx =cdy. (7.14) 
This is the differential equation of the catenary, and its differential is 


addx = dcdy. (7.15) 


Following Leibniz, one may solve this equation by observing that in general, since c 
denotes arc length, 


dcdc =dydy+dx dx. (7.16) 


Differentiate this, using ddy = 0 and (7.15), to obtain 
$$dc\,ddc = dy\,ddy + dx\,ddx = dx\,ddx = dx\,dc\,\frac{dy}{a}.$$
By integration (Leibniz used the term summation), we arrive at adc = (x + b)dy, 
where b is a constant. Next set z = x +b to rewrite, obtaining adc = z dy. Combining 
this with dc dc = dz dz + dy dy, the result emerges as 


aadzdz+aadydy = zzdydy. (7.17) 


Thus, as Leibniz wrote, 


$$y = a\int dz : \sqrt{zz - aa}, \qquad (7.18)$$


or in modern notation 


$$y = a\int \frac{dz}{\sqrt{z^2 - a^2}},$$

so that y gives the area under the curve with ordinate $\frac{a}{\sqrt{z^2 - a^2}}$. This integral can be computed 
in terms of the logarithm. Although we today would wish to evaluate the integral, 
and write it as the logarithm of a specific function, mathematicians of the seventeenth 
century were satisfied with a result expressed in terms of areas or arc lengths of known 
curves, so that from Leibniz’s point of view, this result was sufficient to define the 
catenary. 
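
The remark that the integral can be computed in terms of the logarithm can be illustrated numerically. The Python sketch below is ours: it compares a midpoint-rule value of $a\int dz/\sqrt{z^2 - a^2}$ between two limits with the closed form $a \ln\left(z + \sqrt{z^2 - a^2}\right)$ taken between the same limits. The limits are chosen above a to stay away from the endpoint singularity, and the value of a is an arbitrary assumption.

```python
import math


def catenary_integral(a, z0, z1, n=100_000):
    """Midpoint-rule value of a * integral of dz / sqrt(z*z - a*a) from z0 to z1."""
    h = (z1 - z0) / n
    total = 0.0
    for k in range(n):
        z = z0 + (k + 0.5) * h
        total += 1.0 / math.sqrt(z * z - a * a)
    return a * h * total


def catenary_logarithm(a, z0, z1):
    """The same quantity written as a difference of logarithms."""
    def log_term(z):
        return math.log(z + math.sqrt(z * z - a * a))
    return a * (log_term(z1) - log_term(z0))


a = 2.0
area = catenary_integral(a, 1.5 * a, 3.0 * a)
closed = catenary_logarithm(a, 1.5 * a, 3.0 * a)

# Measuring y from the lowest point z = a gives y = a * ln((z + sqrt(z^2 - a^2)) / a),
# which inverts to z = a * cosh(y / a), the modern form of the catenary.
z = 3.0 * a
y = a * math.log((z + math.sqrt(z * z - a * a)) / a)
z_back = a * math.cosh(y / a)
```

The agreement of `area` with `closed`, and of `z_back` with `z`, shows in modern terms why the area under this curve determines the catenary.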

We remark that the meaning of the exponential curve, or exponential function as 
we call it, and of the logarithmic function, was not clearly understood in the 1690s. 
Leibniz’s paper on the catenary²³ thus devoted some space to these curves. In his 
correspondence with Huygens, Leibniz was called upon to explain these concepts.²⁴ 
Again, l’Hôpital asked Johann Bernoulli the meaning of $m^n$, since the prevailing point 
of view saw the magnitudes m and n as represented by lines.²⁵ To clarify these matters, 
in 1697 Johann Bernoulli published a paper in the Acta Eruditorum, “Principia calculi 
exponentialium seu percurrentium.”²⁶ In this paper, Bernoulli defined the logarithmica 
curve as one whose subtangent was a constant. Since the subtangent is given by 


$$\frac{y}{\dfrac{dy}{dx}},$$


and taking the constant to be one, we essentially have 


y = exp (x) 


and the logarithmica is the exponential curve. Bernoulli applied the logarithmica to 
define $y = x^x$ and showed how to construct this curve pointwise. In effect, he set 
$x^x = \exp(x \ln x)$. In this way, we see how Leibniz and the Bernoullis and their followers 
found it necessary to work with formulas instead of geometric objects. 


7.5 Johann Bernoulli on the Catenary 


In his 1691-1692 lectures on integral calculus, Bernoulli gave details to supplement 
the treatment of the catenary in his 1691 paper.²⁷ He first set down the mechanical 
principles required to obtain the fundamental equation. 

In Figure 7.4, Bernoulli set BG = x, GA = y, Ha=dy, HA =dx,and BA=s. 
He then applied the laws of statics, and in effect Pardies’s law, to obtain the differential 
equation 

$$\frac{dx}{dy} = \frac{s}{a} \qquad (7.19)$$

for some constant a. Bernoulli’s solution, like that of Leibniz, amounted to showing 
that (7.19) was equivalent to 

$$dy = \frac{a\,dx}{\sqrt{x^2 - a^2}}. \qquad (7.20)$$

To show this, he wrote (7.19) as 

$$dy = \frac{a\,dx}{s} \quad\text{or}\quad dy^2 = \frac{a^2\,dx^2}{s^2}.$$

23 Leibniz (1971) vol. 5, pp. 243-247. 

24 See Bos (1996) and Truesdell (1960), pp. 64-72. 
25 Spiess (1955) vol. 1, p. 172. 

26 Joh. Bernoulli (1968) vol. I, pp. 179-187. 

27 Joh. Bernoulli (1742) vol. 3, pp. 404-406. 
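Therefore, 

$$ds^2 = dx^2 + dy^2 = \frac{s^2\,dx^2 + a^2\,dx^2}{s^2}$$

and 

$$dx = \frac{s\,ds}{\sqrt{s^2 + a^2}}.$$

By integration, 

$$x = \sqrt{s^2 + a^2} \quad\text{or}\quad s = \sqrt{x^2 - a^2}.$$

By differentiating, he got 

$$ds = \frac{x\,dx}{\sqrt{x^2 - a^2}}.$$

Squaring this, he had 

$$dx^2 + dy^2 = \frac{x^2\,dx^2}{x^2 - a^2},$$

equivalent to (7.20). 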


7.6 Johann Bernoulli: The Brachistochrone 


In 1696, Bernoulli conceived of and solved the brachistochrone problem:²⁸ Given two 
points in a vertical plane but not vertically aligned, find the curve along which a point 
mass must fall under gravity, starting at one point and passing through the other in 
the shortest possible time. He argued that this mechanical problem was identical to an 
optical problem of the path taken by light moving from one point to another, following 


28 Bernoulli, Joh. (1697). For an English translation of the main points of the paper, see Struik (1969) 
pp. 392-396. 
Figure 7.5 Bernoulli’s diagram to derive the brachistochrone. 


the curve of least time, passing through a medium whose ever-changing density is 
inversely proportional to the velocity of a falling body. As light passes continuously 
from one medium to another, the quantity $\frac{\sin\alpha}{v}$ remains constant, where α is the angle 
between the vertical and the direction of the path and v is the velocity. 

We change Bernoulli’s notation slightly; he used t for the velocity and interchanged 
the x and y. So in Figure 7.5, let AC = y, CM = x, mn = dx, Cc = dy, Mm = ds, 
and α = ∠nMm. Since $\frac{\sin\alpha}{v}$ is a constant, we have 

$$\frac{dx}{ds} = \frac{v}{a} \quad\text{or}\quad a^2\,dx^2 = v^2\,(dx^2 + dy^2),$$

where a is a constant. Since for a falling body $v^2 = 2gy$, we get $dx = \sqrt{\frac{y}{c - y}}\,dy$, 
where $c = \frac{a^2}{2g}$. Now 

$$dy\,\sqrt{\frac{y}{c - y}} = \frac{1}{2}\,\frac{c\,dy}{\sqrt{cy - y^2}} - \frac{1}{2}\,\frac{c\,dy - 2y\,dy}{\sqrt{cy - y^2}}.$$

Integrating this, obtain CM = arc GL − LO and, since MO = CO − arc GL + LO = 
arc LK + LO, it follows that ML = arc LK. Thus, the curve is a cycloid. 
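
That a cycloid satisfies the differential equation $dx = \sqrt{y/(c - y)}\,dy$ can also be checked numerically. The Python sketch below is ours and assumes the standard modern parametrization of the cycloid generated by a circle of diameter c, namely x = (c/2)(θ − sin θ), y = (c/2)(1 − cos θ), a form that does not appear in Bernoulli's paper; the value of c is arbitrary.

```python
import math


def cycloid_point(c, theta):
    """Point of the cycloid generated by a circle of diameter c (y measured downward)."""
    return 0.5 * c * (theta - math.sin(theta)), 0.5 * c * (1.0 - math.cos(theta))


def slope_dx_dy(c, theta, h=1e-6):
    """Central-difference estimate of dx/dy along the cycloid at parameter theta."""
    x0, y0 = cycloid_point(c, theta - h)
    x1, y1 = cycloid_point(c, theta + h)
    return (x1 - x0) / (y1 - y0)


def bernoulli_rhs(c, theta):
    """sqrt(y / (c - y)) evaluated at the same point of the cycloid."""
    _, y = cycloid_point(c, theta)
    return math.sqrt(y / (c - y))


c = 3.0
checks = [(slope_dx_dy(c, t), bernoulli_rhs(c, t)) for t in (0.3, 1.0, 2.0, 2.8)]
```

For each parameter value between 0 and π the two entries of the pair agree, so along the descending half of the arch the cycloid satisfies the equation obtained from the constancy of sin α / v.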

Bernoulli was particularly proud of having linked mechanics with optics. In his 
brachistochrone paper he wrote,²⁹ “In this way I have solved at one stroke two 
important problems — an optical and a mechanical one — and have achieved more 
than I have demanded from others: I have shown that the two problems, taken from 
entirely separate fields of mathematics, have the same character.” Bernoulli mentioned 
the link between geometrical optics and mechanics more than once in his works, but 
this concept was not developed until the 1820s when the Irish mathematician William 
Rowan Hamilton independently worked out the same idea. 


7.7 Newton’s Solution to the Brachistochrone 


In 1696, although he had already solved the problem, Johann Bernoulli made a public 
challenge of the brachistochrone problem and another problem, perhaps directed at 


29 Struik (1969) p. 394. 
Figure 7.6 Newton’s solution to the brachistochrone problem. 


Newton. At that time, Newton was serving in London as Warden of the Mint, having 
given up mathematics. However, upon receiving these problems after a full day’s work, 
he set upon them immediately and reportedly solved both problems within twelve 
hours. Whiteside commented that, although this was a marvelous feat, Newton was 
then out of practice, and thus he took hours instead of minutes.³⁰ We note that in 1685, 
Newton had addressed a problem mathematically similar to the brachistochrone, of the 
solid of revolution of least resistance in a uniform fluid; his solution was included 
in the Principia. In 1697, Newton published a short note solving both of Bernoulli’s 
problems, with an accompanying diagram in the Philosophical Transactions, stating 
that the solution to Bernoulli’s brachistochrone problem was a cycloid. Then in 1700, 
he wrote up the details, apparently for the purpose of explaining the solution to David 
Gregory, nephew of James Gregory. In his brief note in the Transactions, Newton gave 
Figure 7.6 and stated: 


From the given point A draw the unbounded straight line APCZ parallel to the horizontal and 
upon this same line describe both any cycloid AQP whatever, meeting the straight line AB 
(drawn and, if need be, extended) in the point Q, and then another cycloid ADC whose base 
and height [as AC : AP] shall be to the previous one’s base and height respectively as AB to AQ. 
This most recent cycloid will then pass through the point B and be the curve in which a heavy 
body shall, under the force of its own weight, most swiftly reach the point B from the point A. 
As was to be found. 


We summarize Newton’s solution for David Gregory, based upon his diagram 
(Figure 7.7) and Whiteside’s commentary.³¹ Let AB = x, BC = o = CD, BE = 
y (= y(x)). By Taylor’s expansion 


$$CN = y(x + o) = y + \dot{y}o + \frac{1}{2}\ddot{y}o^2,$$

$$DG = y(x + 2o) = y + 2\dot{y}o + 2\ddot{y}o^2.$$


From this it follows that 

$$HN = IK = \dot{y}o + \frac{1}{2}\ddot{y}o^2, \quad IG = \dot{y}o + \frac{3}{2}\ddot{y}o^2, \quad\text{and}\quad LG = 2\dot{y}o + 2\ddot{y}o^2.$$

Define p and q by FN = q and GL = 2p. 


30 Newton (1967-1981) vol. 8, pp. 72-73, footnotes 1 and 2. 
31 Newton (1967-1981) vol. 8, pp. 87-91. 
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Figure 7.7 Newton’s solution for David Gregory. 


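The time taken to travel from $E$ to $G$ is to be minimized as $q$ varies. The expression
for time is given by

\[ \frac{\sqrt{o^2+(p-q)^2}}{\sqrt{x}} + \frac{\sqrt{o^2+(p+q)^2}}{\sqrt{o+x}} = R + S, \]

where

\[ R^2 = \frac{o^2+(p-q)^2}{x}, \qquad S^2 = \frac{o^2+(p+q)^2}{o+x}. \]

Taking the derivative with respect to $q$,

\[ 2R\dot{R} = \frac{-2p\dot{q}+2q\dot{q}}{x}, \qquad 2S\dot{S} = \frac{2p\dot{q}+2q\dot{q}}{x+o}. \]

So the condition for minimum time is that

\[ \frac{-p\dot{q}+q\dot{q}}{Rx} + \frac{p\dot{q}+q\dot{q}}{S(x+o)} = 0, \]

or

\[ \frac{p-q}{\sqrt{x}\,\sqrt{(p-q)^2+o^2}} = \frac{p+q}{\sqrt{x+o}\,\sqrt{(p+q)^2+o^2}}. \]

This condition implies that $\dfrac{p}{\sqrt{x}\,\sqrt{p^2+o^2}}$ must be a constant and since $\dfrac{p}{o} = \dot{y}$, we have

\[ \frac{dy}{\sqrt{x}\,\sqrt{dx^2+dy^2}} = \text{constant} \quad \text{or} \quad 1 + \left(\frac{dx}{dy}\right)^2 = \frac{c}{x}. \]

Thus, $dy = \sqrt{\dfrac{x}{c-x}}\,dx$, and we have the differential equation of a cycloid.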
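The final differential equation can be checked numerically. The sketch below is an illustration only, not part of Newton's argument: it assumes the standard parametrization of a cycloid generated by a circle of diameter $c$, with $x$ measured downward as the depth and $y$ horizontally, and the function names are hypothetical.

```python
import math

def cycloid_point(c, t):
    """Point (x, y) on a cycloid generated by a circle of diameter c.

    x is the depth below the starting point and y the horizontal distance.
    """
    x = 0.5 * c * (1.0 - math.cos(t))
    y = 0.5 * c * (t - math.sin(t))
    return x, y

def residual(c, t):
    """Value of 1 + (dx/dy)^2 - c/x at parameter t; zero if the equation holds."""
    dx_dt = 0.5 * c * math.sin(t)
    dy_dt = 0.5 * c * (1.0 - math.cos(t))
    x, _ = cycloid_point(c, t)
    return 1.0 + (dx_dt / dy_dt) ** 2 - c / x

# Sample a few parameter values on one arch of the cycloid with c = 2.
residuals = [abs(residual(2.0, t)) for t in (0.5, 1.0, 2.0, 3.0)]
print(max(residuals))  # of the order of rounding error
```

The residual vanishes up to rounding because $1 + (dx/dy)^2 = 2/(1-\cos t)$ on this curve, which is exactly $c/x$.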
Figure 7.8 Newton’s derivation of the radius of curvature.


7.8 Newton on the Radius of Curvature 


From the time he studied van Schooten’s book on Descartes’s Géométrie, Newton was 
interested in the problem of finding the radius of curvature at a point on the curve. 
In 1664-1665, he grappled with this question, and after several attempts he found a 
solution. In the 1737 anonymous translation of his treatise on fluxions and series, he
gave four items to consider in this connection; we present the second and third:³²


There are few Problems concerning Curves more elegant than This, or that give a greater Insight 
into their nature. In order to its resolution, I must premise the following general considerations. 


II. If a Circle touches any Curve on its concave side in a given point, and its magnitude be such 
that no other Tangent Circle can be interscribed in the Angle of contact nearer that point, that 
Circle will be the same Curvature as the Curve is of in that point of contact. For that circle which 
comes between the curve and another Circle at the point of contact, varies less from the Curve and 
makes a nearer approach to its Curvature, than that other Circle does; and therefore that Circle 
approaches nearest to its Curvature, between which and the Curve no other Circle can intervene. 


III. Therefore the Center of Curvature at any point of a curve, is the Center of a Circle equally
curved, and thus the Radius or Semi-diameter of Curvature is part of the perpendicular which is 
terminated at that Center. 


After some discussion of properties of the center of curvature, he described one 
method for finding the radius of curvature by constructing normals at two infinitely 
close points, D and d. The intersection of the normals gave the center C of the circle 
of curvature and therefore CD was the radius of curvature. Referring to Figure 7.8, he
explained how to find CD. 


At any point D of the Curve AD, let DT be a Tangent, DC a Perpendicular, and C the Center
of Curvature, as before. And let AB be the Absciss, to which let DB be applied at right angles,
which DC meets in P. Draw DG parallel to AB, and CG perpendicular to it, in which take
Cg of any given magnitude, and draw gδ perpendicular to it, which meets DC in δ. Then it
will be Cg : gδ :: (TB : BD ::) as the Fluxion of the Absciss to the Fluxion of the Ordinate.
Likewise imagine the point D to move in the Curve an infinitely little distance Dd, and drawing
de perpendicular to DG, and Cd perpendicular to the Curve, let Cd meet DG in F, and δg
in f. Then will De be the momentum of the Absciss, de the momentum of the Ordinate, and δf


32 Newton (1964-1967) vol. 1, pp. 81-85. 
the contemporaneous momentum of the Right Line gδ. Therefore $DF = De + \dfrac{de \times de}{De}$. Having
therefore the ratios of these momenta, or which is the same thing, of their generating Fluxions,
you will have the ratio of GC to the given line Cg, which is the same as that of DF to δf. And
thence the point C will be determined.

Therefore let $AB = x$, $BD = y$, $Cg = 1$, and $g\delta = z$. Then it will be $1 : z :: \dot{x} : \dot{y}$, or $z = \dfrac{\dot{y}}{\dot{x}}$.
Now let the momentum δf of $z$ be $\dot{z} \times o$, (that is the product of the velocity and of an infinitely
small quantity $o$,) therefore the momentum $De = \dot{x} \times o$, $de = \dot{y} \times o$, and thence $DF = \dot{x}o + \dfrac{\dot{y}\dot{y}o}{\dot{x}}$.
Therefore it is

\[ Cg\,(1) : CG :: (\delta f : DF ::)\ \dot{z}o : \dot{x}o + \frac{\dot{y}\dot{y}o}{\dot{x}}, \quad \text{that is,} \quad CG = \frac{\dot{x}\dot{x}+\dot{y}\dot{y}}{\dot{x}\dot{z}}. \]

And whereas we are at liberty to ascribe whatever velocity we please to the
Fluxion of the Absciss, to which as to an equable Fluxion the rest may be referred, make $\dot{x} = 1$,
and then $\dot{y} = z$, and $CG = \dfrac{1+zz}{\dot{z}}$; whence $GD = \dfrac{z+z^3}{\dot{z}}$; and $DC = \dfrac{(1+zz)\sqrt{1+zz}}{\dot{z}}$.
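As a small numerical illustration of the closing formula for DC, the sketch below applies it to the parabola $ax = yy$. The function names are hypothetical, the absolute value is taken so that the result is a length, and the closed form $(a+4x)^{3/2}/(2\sqrt{a})$ used for comparison is the standard modern value, stated here as an assumption rather than quoted from the text.

```python
import math

def radius_of_curvature(z, z_dot):
    """Newton's DC = (1 + z^2) * sqrt(1 + z^2) / z_dot, taken in absolute value."""
    return abs((1.0 + z * z) * math.sqrt(1.0 + z * z) / z_dot)

def parabola_radius(a, x):
    """Radius of curvature of a*x = y^2 (upper branch y = sqrt(a*x)) at abscissa x."""
    z = 0.5 * math.sqrt(a / x)               # z = dy/dx when the abscissa flows uniformly
    z_dot = -0.25 * math.sqrt(a) / x ** 1.5  # fluxion of z, i.e. the second derivative
    return radius_of_curvature(z, z_dot)

a, x = 2.0, 3.0
closed_form = (a + 4.0 * x) ** 1.5 / (2.0 * math.sqrt(a))
print(parabola_radius(a, x), closed_form)
```

The two printed values agree, since $1 + z^2 = (a+4x)/(4x)$ for this curve.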


7.9 Johann Bernoulli on the Radius of Curvature 


In 1691, Guillaume l’Hôpital (1661–1704) met Johann Bernoulli, who informed him
that he had found a formula for the radius of curvature. A keen student of mathematics,
l’Hôpital was fascinated, and requested that Bernoulli give him a course of lectures.
In 1691–92, Bernoulli delivered these lectures, which necessarily included
an elaboration of the calculus. L’Hôpital proceeded to write his famous differential
calculus textbook, popular for a century. Bernoulli included his integral calculus
lectures as Lectiones Mathematicae in vol. 3 of his Opera Omnia;³³ he mentions
l’Hôpital in the subtitle of the lectures. The derivation of the formula for the radius
of curvature is contained in Lecture 16 of this work.

In Figure 7.9, the lines OD and BD are radii normal to the curve, with O and B
infinitely close, so that BD is the radius of curvature. Bernoulli let AE = x, EB = y
with BF = dx, FO = dy. Then he could write


and therefore

\[ BC = \frac{dx^2+dy^2}{dx}. \]

Now $BF : FO = BE : EH$, so that

\[ EH = y\frac{dy}{dx}, \quad BH = \frac{y\sqrt{dx^2+dy^2}}{dx}, \quad AH = x + y\frac{dy}{dx}; \]


33 Bernoulli, Joh. (1742) vol. 3, pp. 187-193. 
Figure 7.9 Bernoulli’s figure for the radius of curvature.


taking $d^2x = 0$, the differential of $AH$ could be written as

\[ HG = dx + \frac{dy^2 + y\,d^2y}{dx}. \]

Then because $BC : HG = BD : HD$, he had $(BC - HG) : BC = BH : BD$ and
Bernoulli obtained the formula for the radius of curvature:

\[ BD = \frac{(dx^2+dy^2)\sqrt{dx^2+dy^2}}{-dx\,d^2y}. \]

7.10 Exercises 


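(1) Let $b$ be a root of $y^3 + a^2y - 2a^3 = 0$. Show that if $y^3 + a^2y + axy - 2a^3 - x^3 = 0$ and $a^2 + 3b^2 = c^2$, then

\[ y = b - \frac{abx}{c^2} + \frac{a^4bx^2}{c^6} + \frac{x^3}{c^2} + \frac{a^3b^3x^3}{c^8} - \frac{a^5bx^3}{c^8} + \frac{6a^5b^3x^3}{c^{10}} + \text{etc.} \]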


(2) Suppose $y^3 + y^2 + y - x^3 = 0$ where $x$ is known to be large. Show that


1 2 7 5 ' 
a t } etc. 
oO 3 Ox 81x? 81x3 


See Newton (1964-1967) vol. 1, pp. 46-47 for the above two exercises. 
(3) Show that if $\dot{y} = 3xy^{2/3} + y$, then $y^{1/3} = \tfrac{1}{2}x^2 + \tfrac{1}{18}x^3 + \tfrac{1}{216}x^4 + \text{etc.}$ See
Newton (1964-1967) vol. 1, p. 63; see p. 66 for the next exercise.
(4) In a given triangle, find the dimensions of the greatest inscribed rectangle. 
(5) Show that in the parabola $ax = yy$, the point at which the radius of curvature
is of length $f$ is given by $x = -\tfrac{1}{4}a + \sqrt[3]{\tfrac{1}{16}af^2}$.
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(6) Find the locus of the center of curvature of the parabola x? = ay and of the 
hyperbola (of the second kind) xy? = a>. See Newton (1964-1967) vol. 1, 
p. 87 for this exercise and p. 85 for the previous exercise. Note that Newton 
called a polynomial equation y = p(x) a parabola. Similarly, he called y = 


a 


aR a hyperbola of the second kind. 


(7) Find the asymptotes of the curve $y^3 - x^3 = axy$. See Stone (1730) p. 19.

(8) Take a point E on the line segment AB. Find F such that the product of the 
square of AE times EB is the greatest. See Stone (1730) p. 58. Recall that 
this part of Stone’s book was a translation of l’Hôpital’s differential calculus
book.

(9) Find the volume of a parabolical conoid generated by the rotation of the 
parabola $y^2 = x$ about its axis. See Stone (1730) p. 121 of the appendix.

(10) Show that 


1 3 3-2 : 
[emsas = qe in’ x - pr in’x + ae nx SS 
More generally, find $\int x^m \ln^n x\,dx$. From this result, deduce that

\[ \int_0^1 x^x\,dx = 1 - \frac{1}{2^2} + \frac{1}{3^3} - \frac{1}{4^4} + \text{etc.} \]


See Joh. Bernoulli (1968) vol. 3, pp. 380-381. 


7.11 Notes on the Literature 


Even though Newton was unable to publish his 1671 calculus treatise, the text was 
published several times, starting in the 1730s, in both Latin and English. However, 
Whiteside found that the translations were not completely adequate. Consequently, in 
vol. 3 of Newton (1967-1981), Whiteside presented his own translation accompanied 
by Newton’s Latin text. Truesdell (1960) presents an interesting commentary on the 
work of Pardies, Leibniz, Huygens, and Jakob and Johann Bernoulli relating to the 
catenary. See especially pp. 64—75. 

We have not dealt with Leibniz’s higher differentials in any detail. An interesting 
account appears in Bos (1974). Euler showed that the complicated theory of higher 
differentials could be avoided by using dependent and independent variables. An 
English translation of Bernoulli’s brachistochrone paper appears in Struik (1969), 
pp. 392-396. Simmons (1992) gives an entertaining account in modern terminology. 
The reader may enjoy seeing Knoebel, Laubenbacher, Lodder, and Pengelley (2007) 
for their discussion of Newton’s derivation of the radius of curvature. They review 
this work of Newton in the context of the general notion of curvature and include 
a discussion of the ideas of Huygens, Euler, Gauss, and Riemann on the topic. 
Bourbaki (1994) contains a deep but concise summary of the development of the 
calculus; see pp. 166-198. 
De Analysi per Aequationes Infinitas


8.1 Preliminary Remarks 


Newton’s groundbreaking paper, revealing the power of infinite series to resolve 
intractable problems in algebra and calculus, was probably written in the summer of 
1669. Before Newton, the only infinite power series to be studied in Europe, besides 
the infinite geometric series, was the logarithmic series, by J. Hudde and N. Mercator. 
In the winter of 1664-1665, inspired by Wallis’s work on the area of a quadrant 
of a circle, Newton considered the more general problem of finding the area under 
$y = \sqrt{1 - t^2}$ on the interval $(0,x)$, for $x < 1$.¹ This question led Newton to make the
extraordinary inquiry into the value of $(1 - t^2)^{1/2}$ in powers of $t$; Newton thus discovered
the binomial theorem, first for exponent $\tfrac{1}{2}$ and soon for all rational exponents. He very
quickly perceived the tremendous significance of this result, and more generally, the 
importance and usefulness of infinite series to analysis. 

In this paper Newton resolved the general problem, at least in principle, of finding 
the area under a curve defined explicitly or implicitly. He showed that by means 
of infinite series the problem could be reduced to that of integrating $x^{m/n}$, where $m$
and n were integers. If the equation was given explicitly as y = f(x) with f(x) a 
rational or algebraic function, then f (x) could be expanded as an infinite series by the 
binomial theorem. The area under the curve could then be obtained after term-by-term 
integration. Among the examples he gave were the curves²

\[ y = \frac{1}{1+x^2}, \qquad y = \frac{2x^{1/2} - x^{3/2}}{1 + x^{1/2} - 3x}, \qquad y = \frac{\sqrt{1+ax^2}}{\sqrt{1-bx^2}}. \]

He wrote that the quadrature of the last example yielded the length of an elliptic arc. 

The problem of integrating even these elementary functions would have been 
too difficult for the mathematicians before Newton, but his work on the integration 
of implicitly defined functions took algebra and analysis to a new level. In 1664, 




1 Newton (1967-1981) vol. 1, pp. 104-111.
2 ibid. vol. 2, pp. 215-217. 
Newton learned from the books of Viète and Oughtred how to solve algebraic
equations $f(x) = 0$ by the method of successive approximation. One chose an
approximate solution, and on that basis, one derived successively better ones. The
Islamic mathematician Jamshid al-Kashi (1380–1429) had used a primitive form of
this method to solve cubic equations and to compute roots of numbers, that is, to solve
equations $x^n - N = 0$. With the concept of infinite series in hand, and the technical
skill to work with it, Newton showed how to solve the equation $f(x,y) = 0$ in the
form $y = g(x)$, where $g(x)$ was an infinite series.³ Here he was consciously working
with the analogy between decimals and infinite series. In his tract that has come to be
known as De Methodis Serierum et Fluxionum, he wrote,⁴


Since the operations of computing in numbers and with variables are closely similar—indeed there 
appears to be no difference between them except in the characters by which quantities are denoted, 
definitely in the one case, indefinitely in the latter—, ... I am amazed that it has occurred to no
one (if you except N. Mercator with his quadrature of the hyperbola) to fit the doctrine recently 
established for decimal numbers in similar fashion to variables, especially since the way is then 
open to more striking consequences. For since this doctrine in species has the same relationship 
to Algebra that the doctrine in decimal numbers has to common Arithmetic, its operations of 
Addition, Subtraction, Multiplication, Division, and Root-extraction may easily be learnt from 
the latter’s provided the reader be skilled in each, both Arithmetic and Algebra, and appreciate the 
correspondence between decimal numbers and algebraic terms continued to infinity: namely, that 
to each single place in a decimal sequence decreasing continually to the right there corresponds a 
unique term in a variable array ordered according to the sequence of the dimensions of numerators 
or denominators continued in uniform progression to infinity (as you will see done in the sequel). 
And just as the advantage of decimals consists in this, that when all fractions and roots have been 
reduced to them they take on in a certain measure the nature of integers; so it is the advantage 
of infinite variable-sequences that classes of more complicated terms (such as fractions whose 
denominators are complex quantities, the roots of complex quantities and the roots of affected 
equations) may be reduced to the class of simple ones: that is, to infinite series of fractions having 
simple numerators and denominators and without the all but insuperable encumbrances which 
beset the others. I will first, consequently, show how to reduce other quantities to terms of this 
sort and then I will apply this Analysis to the resolution of problems. 


Newton obtained higher and higher powers of x by successive approximation. He 
then found the area under a curve defined implicitly as f(x, y) = 0 by integrating g(x) 
term by term. He gave this method in the De Analysi, but he realized that one did not 
always obtain the solution y = g(x) as a power series. In a longer treatise on calculus 
and infinite series of 1671, Newton gave examples where the solution around x = 0 
was of the form $y = x^{\alpha} g(x)$, with $g(0) \neq 0$ and $\alpha$ a fraction.⁵ He realized that for
only certain values of $\alpha$ could $g(0)$ be determined; for those values, the functions
$y = x^{\alpha} g(x)$ were solutions of $f(x,y) = 0$ in the neighborhood of $x = 0$. He
devised a method now called Newton’s polygon to determine the allowable values of
$\alpha$. This method has important applications in algebraic geometry and analysis. Newton
extended his method for solving f(x, y) = 0 to obtain the inverses of functions defined 
by infinite series. He knew that his formula, mentioned earlier, for the area of a sector 


3 ibid. p. 221, footnote 59. 
4 ibid. vol. 3, pp. 33 and 35. 
5 ibid. vol. 3, pp. 42-73. 
of a circle was equivalent to the series for arcsine. By inversion he found the series for
sine and from that the series for cosine. We have seen that Madhava earlier obtained
the series for these functions by a different method.

Newton uncharacteristically wrote up his results on infinite series because by
1668 others were beginning to make similar discoveries. Mercator published a book,
Logarithmotechnia, in which he expanded $\frac{1}{1+x}$ as a series and integrated term by term
to obtain his series for $\ln(1 + x)$; in fact, Mercator gave the series for only some small
values of $x$. The general series for $\ln(1 + x)$ in powers of $x$, or e in Wallis’s notation,
was given in Wallis’s review of Mercator’s book, contained in the 1668 volume of 
the Philosophical Transactions. In the spring of 1665, Newton had done exactly the 
same thing; after Mercator’s publication, he realized that he would lose credit for his 
discoveries unless he made them known. 

Newton submitted his paper to Isaac Barrow, then Lucasian Professor of Mathematics
at Cambridge, who mentioned it to Collins in a letter of July 20, 1669, with the
words:⁶


A friend of mine here, that hath a very excellent genius to those things, brought me the other day 
some papers, wherein he hath sett downe methods of calculating the dimensions of magnitudes 
like that of Mr Mercator concerning the hyperbola, but very generall; as also of resolving 
aequations; which I suppose will please you; and I shall send you them by the next. 


Barrow wrote Collins again on August 20, 1669:⁷


I am glad my friends paper giveth you so much satisfaction. his name is Mr Newton; a fellow 
of our College, & very young (being but the second yeest [youngest] Master of Arts) but of an 
extraordinary genius & proficiency in these things. you may impart the papers if you please to my 
Ld Brounker [sic]. 


Collins made a complete copy of Newton’s paper and communicated some of 
its results to his correspondents in Britain, France, and Italy. He and Barrow urged 
Newton to publish his paper as an appendix to Barrow’s optical lectures but Newton 
resisted, perhaps because he had a much larger work in mind, finally written in 1671. 
This long but incomplete tract is referred to as De Methodis Serierum et Fluxionum, 
though its original title or whether it even had one is unclear, since the first page of 
the original manuscript is lost and mathematicians of Newton’s own time referred to 
it by various titles. In the De Methodis, Newton showed how to find the derivative by 
implicit differentiation of the equation for the curve f(x,y) = 0. He applied this to 
problems on tangents, normals, and radii of curvature. Conversely, given a fluxional 
(differential) equation, he explained how it could be solved, particularly with infinite 
series. The equations he worked with here were algebraic differential equations. 

It is interesting to note that when Newton wrote up his results on series, realizing 
that others were working on similar problems, this exercise gave him the opportunity 
to rethink his ideas and improve upon them. This happened to him several times. 
For example, in the spring of 1684, David Gregory, nephew of James, published a 


6 Newton (1959-60) vol. 1, pp. 13-14. 
7 ibid. pp. 14-15. 
fifty-page tract Exercitatio Geometrica de Dimensione Figurarum,⁸ discussing his
uncle’s results on infinite series related to the binomial theorem. He also promised 
to write a sequel with more results. This immediately spurred Newton to compose the 
Matheseos Universalis Specimina, in the first part of which he gave a brief history of 
his work on series and the results on this topic he had communicated to Collins and to 
Leibniz. He then went on to develop some new ideas on finite differences and series. 
The paper was not completed and in fact ended in the middle of a sentence. Very soon 
after this, he reorganized his ideas and presented them in a paper called De Computo 
Serierum. Here he left out the history but further clarified the new mathematical idea 
on series and differences, framed as the transformation formula now often called 
Euler’s transformation. (See our Sections 10.1 and 10.2.) Unfortunately, Newton 
never published these papers. Similarly, in 1691 he wrote and rewrote the tract De 
Quadratura Curvarum, containing the first explicit statement of Taylor’s theorem; he 
published only a portion of this work some years later. In this connection, see our 
Section 11.2. 


8.2 Algebra of Infinite Series 


Newton pointed out in his De Analysi that just as infinite decimals were needed to
divide by numbers, extract roots of numbers, and solve equations with numerical
coefficients, infinite series were needed to divide by polynomials, extract roots of
algebraic expressions, and solve equations with algebraic coefficients. To illustrate
division, Newton considered the equation $y = \dfrac{a^2}{b+x}$ and showed that the process led to
the series⁹

\[ y = \frac{a^2}{b} - \frac{a^2x}{b^2} + \frac{a^2x^2}{b^3} - \frac{a^2x^3}{b^4} + \text{etc.} \tag{8.1} \]

and, after term-by-term integration, to the area

\[ \frac{a^2x}{b} - \frac{a^2x^2}{2b^2} + \frac{a^2x^3}{3b^3} - \frac{a^2x^4}{4b^4} + \text{etc.}, \tag{8.2} \]

if the required area was taken over the interval $(0,x)$. If, however, the required area
under $y = \dfrac{a^2}{b+x}$ was over $(x,\infty)$, then with $a = b = 1$, and $x$ replaced by $x^2$, he
started with the series

\[ x^{-2} - x^{-4} + x^{-6} - x^{-8} + \text{etc.} \tag{8.3} \]

The area was then given by

\[ x^{-1} - \frac{x^{-3}}{3} + \frac{x^{-5}}{5} - \frac{x^{-7}}{7} + \text{etc.} \tag{8.4} \]

8 Gregory (1684). 
9 Newton (1967-1981) vol. 2, pp. 213-215. 
Newton noted that $x$ should be small in (8.2), but should be large in (8.4), though
he did not specify how small or how large. At the end of the paper, he made some
remarks on convergence. He observed that if $x = \tfrac{1}{2}$, then $x$ would be half of all of
$x + x^2 + x^3 + x^4$ etc. and $x^2$ half of all of $x^2 + x^3 + x^4 + x^5$ etc. So if $x < \tfrac{1}{2}$, then
$x$ would be more than half of all of $x + x^2 + x^3$ etc. and $x^2$ more than half of all of
$x^2 + x^3 + x^4$ etc. He then extended the argument to the case $\tfrac{x}{b}$, where $b$ was a constant.

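In his second example, Newton applied the algorithm for finding square roots of
numbers to $\sqrt{a^2 + x^2}$, obtaining the infinite series

\[ a + \frac{x^2}{2a} - \frac{x^4}{8a^3} + \frac{x^6}{16a^5} - \frac{5x^8}{128a^7} + \frac{7x^{10}}{256a^9} - \frac{21x^{12}}{1024a^{11}} + \text{etc.} \tag{8.5} \]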
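The numerical coefficients in (8.5) are those of the binomial series for exponent $\tfrac{1}{2}$. A minimal sketch of how they can be generated with exact fractions follows; it uses the binomial recursion rather than the square-root algorithm Newton actually applied here, and the function name is hypothetical.

```python
from fractions import Fraction

def binomial_series_coefficients(exponent, count):
    """First `count` coefficients of (1 + u)**exponent in powers of u."""
    coeffs = [Fraction(1)]
    for k in range(1, count):
        # Each coefficient follows from the previous one: c_k = c_{k-1} * (exponent - k + 1) / k
        coeffs.append(coeffs[-1] * (exponent - (k - 1)) / k)
    return coeffs

half_power = binomial_series_coefficients(Fraction(1, 2), 7)
print([str(c) for c in half_power])
# ['1', '1/2', '-1/8', '1/16', '-5/128', '7/256', '-21/1024']
```

Putting $u = x^2/a^2$ and multiplying through by $a$ turns these coefficients into the terms of (8.5).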


A little later in the paper, Newton explained his method of successive approximations
to solve polynomial equations $f(x, y) = 0$. To illustrate the method, he first took
an equation with constant coefficients:¹⁰

\[ y^3 - 2y - 5 = 0. \tag{8.6} \]

An approximate solution would be 2, so he set $y = 2 + p$ to transform (8.6) to

\[ p^3 + 6p^2 + 10p - 1 = 0. \tag{8.7} \]

He argued that since $p$ was small, the terms $p^3 + 6p^2$ could be neglected, though he
noted that a better approximation would be obtained if only $p^3$ were neglected. Thus,
he had $p = 0.1$, and he substituted $p = 0.1 + q$ in (8.7) to obtain

\[ q^3 + 6.3q^2 + 11.23q + 0.061 = 0. \]

Newton linearized this equation to $11.23q + 0.061 = 0$, solved for $q$ to get
$q = -0.0054$, set $q = -0.0054 + r$, and wrote that one could continue in this manner.
In a similar way, he resolved the equation¹¹


\[ y^3 + a^2y - 2a^3 + axy - x^3 = 0 \tag{8.8} \]

for small values of $x$. He set $x = 0$ to obtain

\[ y^3 + a^2y - 2a^3 = 0 \tag{8.9} \]

so that $y = a$ was a solution; he set $y = a + p$ in (8.8) and took the linear part of the
equation to get $p = -\tfrac{1}{4}x$. In this manner, he had the series

\[ y = a - \frac{x}{4} + \frac{x^2}{64a} + \frac{131x^3}{512a^2} + \frac{509x^4}{16384a^3} + \text{etc.} \tag{8.10} \]
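The numerical example $y^3 - 2y - 5 = 0$ can be replayed step by step. The sketch below is one possible reading of the procedure, not a transcription of Newton's working: each stage shifts the polynomial by the current correction, keeps only the linear and constant terms, and rounds the resulting correction to a chosen number of decimals. The function names and the rounding rule are assumptions made for illustration.

```python
from fractions import Fraction

def shift_polynomial(coeffs, shift):
    """Coefficients (highest power first) of f(shift + p) in powers of p."""
    result = list(coeffs)
    n = len(result)
    # Repeated synthetic division by (y - shift) yields the shifted coefficients.
    for i in range(n - 1):
        for j in range(1, n - i):
            result[j] += result[j - 1] * shift
    return result

def successive_approximation(coeffs, start, decimals):
    """Return the approximate root and the transformed polynomial at each stage."""
    root = Fraction(start)
    poly = shift_polynomial([Fraction(c) for c in coeffs], root)
    stages = [poly]
    for places in decimals:
        # Linearize: neglect all powers above the first and solve for the correction.
        step = round(-poly[-1] / poly[-2], places)
        root += step
        poly = shift_polynomial(poly, step)
        stages.append(poly)
    return root, stages

root, stages = successive_approximation([1, 0, -2, -5], 2, [1, 4])
print([float(c) for c in stages[0]])  # [1.0, 6.0, 10.0, -1.0], the coefficients of (8.7)
print([float(c) for c in stages[1]])  # [1.0, 6.3, 11.23, 0.061]
print(float(root))                    # 2.0946
```

Exact fractions are used so that the intermediate polynomials match the decimal coefficients in the text without rounding noise.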


10 ibid. pp. 219-221. 
11 ibid. pp. 223-227.




He then used the example

\[ y^3 + axy + x^2y - a^3 - 2x^3 = 0 \tag{8.11} \]

to illustrate how to obtain a solution for large values of $x$. Here he started with the
highest power terms in the equation (8.11) to get

\[ y^3 + x^2y - 2x^3 = 0. \]

Since $y = x$ was a solution of this, he set $y = x + p$ in (8.11) and proceeded as before.

Finally, Newton showed that similar methods could be applied when the equation 
had an infinite number of terms. The problem of interest was to solve y = f(x), for x 
in terms of y, where f(x) was an infinite series. This gave him a series for the inverse 
function and in the De Analysi, he applied it to the cases where 


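\[ f(x) = -\ln(1-x) = x + \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 + \tfrac{1}{4}x^4 + \cdots \]

and where

\[ f(x) = \arcsin x = x + \tfrac{1}{6}x^3 + \tfrac{3}{40}x^5 + \tfrac{5}{112}x^7 + \cdots. \]

Thus, he found the series for the exponential and sine functions.¹²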


Observe that Newton’s method of successive approximations for finding the series 
for inverse functions is actually equivalent to the method of undetermined coefficients. 
One assumes that if $z$ denotes the series, say for $-\ln(1 - x)$, then

\[ x = a_0 + a_1z + a_2z^2 + \cdots, \]

and the values of $a_0, a_1, a_2, \ldots$ are obtained by substituting back in the series and
equating the coefficients of the powers of $z$ on both sides of the equation. Newton must
have understood this because in his October 1676 letter to Oldenburg he wrote,¹³


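Let the equation for the area of an hyperbola be proposed

\[ z = x + \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 + \tfrac{1}{4}x^4 + \tfrac{1}{5}x^5 \ \text{etc.} \]

and its terms being multiplied into themselves, there results

\[ z^2 = x^2 + x^3 + \tfrac{11}{12}x^4 + \tfrac{5}{6}x^5 \ \text{etc.}, \]
\[ z^3 = x^3 + \tfrac{3}{2}x^4 + \tfrac{7}{4}x^5 \ \text{etc.}, \]
\[ z^4 = x^4 + 2x^5 \ \text{etc.}, \]
\[ z^5 = x^5 \ \text{etc.} \]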

12 ibid. pp. 235-237. 
13 Newton (1959-1960) vol. 2, p. 146. 
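Now subtract $\tfrac{1}{2}z^2$ from $z$, and there remains

\[ z - \tfrac{1}{2}z^2 = x - \tfrac{1}{6}x^3 - \tfrac{5}{24}x^4 - \tfrac{13}{60}x^5 \ \text{etc.} \]

To this I add $\tfrac{1}{6}z^3$, and it becomes

\[ z - \tfrac{1}{2}z^2 + \tfrac{1}{6}z^3 = x + \tfrac{1}{24}x^4 + \tfrac{3}{40}x^5 \ \text{etc.} \]

I subtract $\tfrac{1}{24}z^4$ and there remains

\[ z - \tfrac{1}{2}z^2 + \tfrac{1}{6}z^3 - \tfrac{1}{24}z^4 = x - \tfrac{1}{120}x^5 \ \text{etc.} \]

I add $\tfrac{1}{120}z^5$ and it becomes

\[ z - \tfrac{1}{2}z^2 + \tfrac{1}{6}z^3 - \tfrac{1}{24}z^4 + \tfrac{1}{120}z^5 = x \]

as nearly as possible; or

\[ x = z - \tfrac{1}{2}z^2 + \tfrac{1}{6}z^3 - \tfrac{1}{24}z^4 + \tfrac{1}{120}z^5 \ \text{etc.} \]

He then went on to state two general theorems:

Let $z = ay + by^2 + cy^3 + dy^4 + ey^5 +$ etc. Then conversely will

\[ y = \frac{z}{a} - \frac{b}{a^3}z^2 + \frac{2b^2 - ac}{a^5}z^3 + \frac{5abc - 5b^3 - a^2d}{a^7}z^4 + \frac{3a^2c^2 - 21ab^2c + 6a^2bd + 14b^4 - a^3e}{a^9}z^5 + \text{etc.} \tag{8.12} \]

Let $z = ay + by^3 + cy^5 + dy^7 + ey^9 +$ etc. Then conversely will

\[ y = \frac{z}{a} - \frac{b}{a^4}z^3 + \frac{3b^2 - ac}{a^7}z^5 + \frac{8abc - a^2d - 12b^3}{a^{10}}z^7 + \frac{55b^4 - 55ab^2c + 10a^2bd + 5a^2c^2 - a^3e}{a^{13}}z^9 + \text{etc.} \tag{8.13} \]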
Newton observed that if he took $a = 1$, $b = \tfrac{1}{6}$, $c = \tfrac{3}{40}$, $d = \tfrac{5}{112}$, etc., in the second
series, then the series for $\sin z$ would follow. Newton actually wrote the series for
$r \sin z$, where $r$ was the radius of the circle, in powers of $\tfrac{z}{r}$ because $z$ was the length
of an arc of the circle. Euler later eliminated the role of the radius and defined the
trigonometric functions as we do now.
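The step-by-step elimination in Newton's letter translates directly into a short program. The sketch below follows the letter's scheme: form the powers of $z$ as series in $x$, then remove the lowest remaining power of $x$ at each stage. It works on truncated series with exact fractions; the function names `multiply` and `revert` are hypothetical, and the constant term of the input series is assumed to be zero.

```python
from fractions import Fraction

def multiply(p, q, order):
    """Product of two power series (lists indexed by power), truncated at `order`."""
    out = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                out[i + j] += a * b
    return out

def revert(series, order):
    """Given z = series[1]*x + series[2]*x**2 + ..., return d with x = sum d[k]*z**k."""
    z = [Fraction(c) for c in series[: order + 1]]
    z += [Fraction(0)] * (order + 1 - len(z))
    remainder = [Fraction(0)] * (order + 1)
    remainder[1] = Fraction(1)                     # the series "x" that must be matched
    power = [Fraction(1)] + [Fraction(0)] * order  # z**0
    d = [Fraction(0)] * (order + 1)
    for k in range(1, order + 1):
        power = multiply(power, z, order)          # now z**k as a series in x
        d[k] = remainder[k] / power[k]             # cancel the lowest remaining power of x
        remainder = [r - d[k] * p for r, p in zip(remainder, power)]
    return d

log_series = [0] + [Fraction(1, k) for k in range(1, 6)]
print([str(c) for c in revert(log_series, 5)])
# ['0', '1', '-1/2', '1/6', '-1/24', '1/120']

arcsine_series = [0, 1, 0, Fraction(1, 6), 0, Fraction(3, 40), 0, Fraction(5, 112)]
print([str(c) for c in revert(arcsine_series, 7)])
# ['0', '1', '0', '-1/6', '0', '1/120', '0', '-1/5040']
```

The first call reproduces the coefficients Newton reached in the letter, and the second gives the sine coefficients that follow from (8.13) with $a = 1$, $b = \tfrac{1}{6}$, $c = \tfrac{3}{40}$, $d = \tfrac{5}{112}$.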


8.3 Newton’s Polygon 


In his De Methodis, Newton explained his method of solving $f(x, y) = 0$ by means
of the Newton polygon, where the solution took the form $y = x^{\alpha} y_1$, with $\alpha$ rational
and $y_1$ a power series in $x$.¹⁴ To find the possible values of $\alpha$, he plotted the points
$(b,a)$ for each term $cx^ay^b$ in $f(x,y)$, such that $b$ ran along the horizontal axis and $a$
along the vertical. He then took the lower portion of the convex hull of these points,
consisting of the straight line(s) joining the vertical to the horizontal axis. Although
it does not enclose an area, this lower portion is called the Newton polygon and the
slope(s) of these line(s) gave him the values of $\alpha$. For example, if $m$ were the slope of
a line in the polygon, then one value of $\alpha$ would be given by $-m$. These values of $\alpha$
permitted the evaluation of a nonzero value of $y_1(0)$. Note here that the lines joining
the other pairs of points $(b,a)$ could allow for zero values of $y_1(0)$.
Newton considered the example

\[ y^6 - 5xy^5 + \frac{x^3}{a}y^4 - 7a^2x^2y^2 + 6a^3x^3 + b^2x^4 = 0, \]

where he had the points (6,0), (5,1), (4,3), (2,2), (0,3), and (0,4). The line joining (0,3),
(2,2), and (6,0) formed the polygon and gave Newton the terms

\[ y^6 - 7a^2x^2y^2 + 6a^3x^3; \]

setting these equal to zero, he obtained the lowest-order term in the expansion of $y$ as
a series in $x$. The slope of the line in the polygon was $-\tfrac{1}{2}$ so in the case $y = cx^{1/2}$,
the terms had the same power in $x$. Newton could then set $y = v\sqrt{ax}$ to reduce the
equation to $v^6 - 7v^2 + 6 = 0$. He obtained $v = \pm 1, \pm\sqrt{2}, \pm\sqrt{-3}$ but rejected
the complex roots. Thus, he had four possible initial values of $y$: $\pm\sqrt{ax}$, $\pm\sqrt{2ax}$.
He wrote that all four expressions were acceptable initial values for $y$; by successive
approximations, he went on to find more terms.
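The polygon in this example can be computed mechanically. The following sketch is a modern illustration, not Newton's own procedure: it finds the lower boundary of the convex hull of the exponent pairs $(b,a)$ with a standard monotone-chain scan, keeps points lying on an edge, and reads off the exponents $\alpha$ as the negatives of the negative slopes. All function names are hypothetical.

```python
from fractions import Fraction

def cross(o, p, q):
    """Cross product of the vectors o->p and o->q; positive for a left turn."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])

def newton_polygon(points):
    """Lower boundary of the convex hull of (b, a) pairs, from left to right.

    Points lying on an edge are kept, since each contributes a term.
    """
    hull = []
    for p in sorted(set(points)):
        # Drop the last point while it lies strictly above the segment hull[-2] -> p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) < 0:
            hull.pop()
        hull.append(p)
    return hull

def exponents(points):
    """Candidate exponents alpha = -slope for the descending edges of the polygon."""
    hull = newton_polygon(points)
    slopes = {Fraction(a2 - a1, b2 - b1)
              for (b1, a1), (b2, a2) in zip(hull, hull[1:]) if b2 != b1}
    return sorted(-m for m in slopes if m < 0)

points = [(6, 0), (5, 1), (4, 3), (2, 2), (0, 3), (0, 4)]
print(newton_polygon(points))              # [(0, 3), (2, 2), (6, 0)]
print([str(e) for e in exponents(points)])  # ['1/2']

# With w = v^2 the reduced equation v^6 - 7 v^2 + 6 = 0 becomes w^3 - 7 w + 6 = 0.
print([w ** 3 - 7 * w + 6 for w in (1, 2, -3)])  # [0, 0, 0]
```

The last line confirms the roots $v^2 = 1, 2, -3$ quoted in the text. For other inputs the scan may also return rising edges to the right of the lowest point, which `exponents` ignores.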

Newton’s solutions of f(x,y) = 0 are the first known examples of the implicit 
function theorem. Significantly, though Newton did not give an existence proof, he 
presented an algorithm for deriving the solution. S. S. Abhyankar pointed out that this 
algorithm is applicable to existence proofs in analysis and can also produce the formal
solutions required in algebraic geometry.¹⁵


8.4 Newton on Differential Equations 


In his 1670–71 treatise De Methodis Serierum et Fluxionum, Newton discussed how
to find the derivative from the equation $f(x,y) = 0$ and, conversely, how to find the
relation between $x$ and $y$, given a first-order differential equation $f(x, y, \dot{x}, \dot{y}) = 0$.
Note that in the 1690s, Newton began to use the dot notation to indicate a fluxion,
or derivative. In his earliest work, including his work in the 1670s, he employed the
letters $p$, $q$, or $m$, $n$.


14 Newton (1967-1981) vol. 3, pp. 49-55. 
15 Abhyankar (1976) pp. 416-417.
To illustrate how series could be used to solve differential equations, Newton
considered several examples,¹⁶ including

\[ \frac{\dot{y}}{\dot{x}} = 1 + \frac{y}{a-x}, \qquad \dot{y}^2 = \dot{x}\dot{y} + \dot{x}^2x^2, \]

and

\[ \dot{y}^3 + ax\dot{x}^2\dot{y} + a^2\dot{x}^2\dot{y} - x^3\dot{x}^3 - 2a^3\dot{x}^3 = 0. \]

He rewrote the first equation as

\[ \frac{\dot{y}}{\dot{x}} = 1 + \frac{y}{a} + \frac{xy}{a^2} + \frac{x^2y}{a^3} + \frac{x^3y}{a^4} + \text{etc.} \]

and then showed how to obtain particular solutions of this equation by assuming a
series solution. He rewrote the second equation as a quadratic in $\frac{\dot{y}}{\dot{x}}$ to get

\[ \frac{\dot{y}^2}{\dot{x}^2} = \frac{\dot{y}}{\dot{x}} + x^2 \]

and solved the quadratic algebraically to obtain

\[ \frac{\dot{y}}{\dot{x}} = \frac{1}{2} \pm \sqrt{\frac{1}{4} + x^2}. \]

After expanding $(\tfrac{1}{4} + x^2)^{1/2}$ by the binomial theorem, Newton integrated the
resulting infinite series term by term. He apparently did not observe that $(\tfrac{1}{4} + x^2)^{1/2}$
could be integrated directly in terms of the logarithm. Barrow gave the integral of
$(a^2 + x^2)^{1/2}$ in his Lectiones Geometricae¹⁷ of 1670 and Newton knew this work quite
well. However, unlike Leibniz, Newton may not have been particularly interested in
closed-form solutions.¹⁸ Newton changed the third equation into a cubic in $\frac{\dot{y}}{\dot{x}}$,
the same cubic as in (8.8). So from (8.10) he saw that

\[ \frac{\dot{y}}{\dot{x}} = a - \frac{x}{4} + \frac{x^2}{64a} + \frac{131x^3}{512a^2} + \text{etc.} \]


16 Newton (1967-1981) vol. 3, pp. 83-91. 

17 Barrow (1916) pp. 160-161 and p. 185.

18 Nevertheless, see Newton (1967-1981) vol. 3, pp. 199-209, where Newton discussed closed forms in terms 
of known curves. 
and hence

\[ y = ax - \frac{x^2}{8} + \frac{x^3}{192a} + \frac{131x^4}{2048a^2} + \text{etc.} \]
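The passage from the series for the slope to the series for $y$ is a term-by-term integration, which can be sketched in a few lines. The example takes $a = 1$ for simplicity and assumes a zero constant of integration; the function name is hypothetical.

```python
from fractions import Fraction

def integrate_series(coeffs):
    """Integrate sum(coeffs[k] * x**k) term by term, with zero constant of integration."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

# Coefficients of the slope series from (8.10) with a = 1: 1 - x/4 + x^2/64 + 131 x^3/512
slope = [Fraction(1), Fraction(-1, 4), Fraction(1, 64), Fraction(131, 512)]
solution = integrate_series(slope)
print([str(c) for c in solution])  # ['0', '1', '-1/8', '1/192', '131/2048']
```

The output matches the coefficients of $x$, $x^2$, $x^3$, and $x^4$ in the series for $y$ when $a = 1$.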


8.5 Newton’s Earliest Work on Series 


In some of the earliest material recorded in his mathematical notebooks, Newton 
raised the problem of finding x, given sin x, observing that the problem was equivalent 
to finding the area of a segment of a circle. Newton did this work, inspired by Wallis’s 
book, in winter 1664–65.¹⁹

In Figure 8.1, let aec be a quarter of the circle of radius one with center p and let
pq = x. If the angle ape = θ, then $x = \sin\theta$ and the area of the sector ape would
be $\tfrac{1}{2}\theta = \tfrac{1}{2}\arcsin x$. Newton’s problem was to find an expression for the area given
by $\tfrac{1}{2}\arcsin x$, when $x$ was known. He knew, from a study of Wallis’s Arithmetica
Infinitorum, that the area aeqp was equal to the area under the curve $y = \sqrt{1 - t^2}$
over the interval $[0, x]$. The area of the triangle peq was $\tfrac{1}{2}x\sqrt{1 - x^2}$. So his formula
in modern notation would be given as

\[ \text{area of sector } aep = \tfrac{1}{2}\arcsin x = \int_0^x \sqrt{1-t^2}\,dt - \tfrac{1}{2}x\sqrt{1-x^2}. \tag{8.14} \]


In the course of this work, he discovered the binomial theorem. This gave him the
result

\[ (1-x^2)^{1/2} = 1 - \frac{x^2}{2} - \frac{x^4}{8} - \frac{x^6}{16} - \frac{5x^8}{128} - \frac{7x^{10}}{256} - \frac{21x^{12}}{1024} - \text{etc.} \]

Figure 8.1 Newton’s figure for derivation of the arcsine series.


19 Newton (1967-1981) vol. 1, pp. 104-110.
He substituted this series in the integral and integrated term by term to get the
solution of his problem:

\[ \arcsin x = x + \frac{x^3}{6} + \frac{3x^5}{40} + \frac{5x^7}{112} + \frac{35x^9}{1152} + \text{etc.} \tag{8.15} \]
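The computation behind (8.15) can be replayed with exact fractions. The sketch below, with hypothetical function names, expands $\sqrt{1-x^2}$ by the binomial theorem, integrates term by term, and applies (8.14) in the form $\arcsin x = 2\int_0^x \sqrt{1-t^2}\,dt - x\sqrt{1-x^2}$.

```python
from fractions import Fraction

def sqrt_one_minus_x2(terms):
    """Coefficients s[k] of x**(2k) in (1 - x^2)^(1/2), by the binomial theorem."""
    s = [Fraction(1)]
    for k in range(1, terms):
        # Binomial recursion for exponent 1/2, with the sign change from u = -x^2.
        s.append(-s[-1] * (Fraction(1, 2) - (k - 1)) / k)
    return s

def arcsine_coefficients(terms):
    """Coefficients of x**(2k+1) in arcsin x, from 2*integral - x*sqrt(1 - x^2)."""
    s = sqrt_one_minus_x2(terms)
    # Integrating s[k]*t**(2k) gives s[k]*x**(2k+1)/(2k+1); x*sqrt(...) gives s[k]*x**(2k+1).
    return [2 * c / (2 * k + 1) - c for k, c in enumerate(s)]

print([str(c) for c in arcsine_coefficients(5)])
# ['1', '1/6', '3/40', '5/112', '35/1152']
```

The printed fractions are the coefficients of $x, x^3, x^5, x^7, x^9$ in (8.15).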


In the later De Analysi, Newton found the series for $\arcsin x$ by determining the
arc length of the arc of a circle. In this case, he had to integrate $\frac{1}{\sqrt{1-t^2}}$. If this were
combined with (8.14), the result would be

\[ \int_0^x \frac{dt}{\sqrt{1-t^2}} = \arcsin x. \]

Thus, Newton was aware of the integral for arcsine as well as the formula

\[ \int_0^x \sqrt{1-t^2}\,dt = \frac{1}{2}x\sqrt{1-x^2} + \frac{1}{2}\int_0^x \frac{dt}{\sqrt{1-t^2}}. \]


Modern textbooks usually derive the last formula by means of integration by parts. 
After 1666, Newton was effectively aware of substitution and integration by parts, 
but to obtain a result more simply, he often gave geometric arguments, similar to the 
preceding one; note, however, that he derived the series for arcsine in 1664-1665. 
Newton discovered another interesting formula from which the series for arcsine can 
be easily derived; the first mention of it occurs in his letter to Oldenburg dated June 13,
1676, in response to Leibniz:²⁰


If an arc is to be taken in a given ratio to another arc, let $d$ be the diameter, $x$ the chord of the given
arc, and the required arc be to that given arc as $n : 1$. Then the chord of the arc required will be

\[ nx + \frac{1-n^2}{2\times 3d^2}x^2A + \frac{9-n^2}{4\times 5d^2}x^2B + \frac{25-n^2}{6\times 7d^2}x^2C + \frac{36-n^2}{8\times 9d^2}x^2D + \frac{49-n^2}{10\times 11d^2}x^2E + \text{etc.} \]


Here note that when n is an odd number the series is no longer infinite and becomes the same as 
that which results by common algebra for multiplying the given angle by that number n. 


As Newton explained in his letter, A stood for the first term, B for the second 
term, C for the third term, etc. Observe that this formula is equivalent to 


$$\sin n\theta = n\sin\theta - \frac{n(n^2-1)}{3!}\sin^3\theta + \frac{n(n^2-1)(n^2-3^2)}{5!}\sin^5\theta - \frac{n(n^2-1)(n^2-3^2)(n^2-5^2)}{7!}\sin^7\theta + \cdots. \tag{8.16}$$


Note that the series for arcsine is obtained by dividing (8.16) by n and letting n tend 
to zero. The corresponding cosine series is given by 


20 Newton (1959-1960) vol. 2, p. 36. 
$$\cos n\theta = 1 - \frac{n^2}{2!}\sin^2\theta + \frac{n^2(n^2-2^2)}{4!}\sin^4\theta - \frac{n^2(n^2-2^2)(n^2-4^2)}{6!}\sin^6\theta + \cdots. \tag{8.17}$$

This series does not appear in the extant papers of Newton, although one may safely
assume he must have known this result. Note that if we subtract 1 from both sides and
then divide the equation by $n^2$, then we obtain the power series for $\arcsin^2 x$ when n
tends to zero. The result mentioned by Newton for n odd, as obtainable from common
algebra, may have been known to him since 1664 when he was a student.²¹


8.6 De Moivre on Newton’s Formula for sin nθ


In 1698, de Moivre gave a derivation of Newton’s series for sin nθ in “A Method
of Extracting the Root of an Infinite Equation” in the Philosophical Transactions.²²
Now de Moivre had already seen the method of undetermined coefficients used in
this context, as presented in the letter of Newton for Leibniz. Note that British
mathematicians of the 1690s were aware of these letters, since Wallis had published
portions of them in his 1685 book on algebra²³ and had presented more complete
accounts in the 1690s. In his paper, de Moivre considered the situation in which the
series on the left side of the equation was in terms of a variable different from that on 
the right; one variable had to be determined in terms of the other. He stated the main 
result at the very beginning of his paper: 

If $az + bzz + cz^3 + dz^4 + ez^5 + fz^6 +$ etc. $= gy + hyy + iy^3 + ky^4 + ly^5 +$
etc., then z will be

$$\begin{aligned}
z = {} & \frac{g}{a}\,y + \frac{h - bAA}{a}\,y^2 + \frac{i - 2bAB - cA^3}{a}\,y^3 \\
& + \frac{k - bBB - 2bAC - 3cAAB - dA^4}{a}\,y^4 \\
& + \frac{l - 2bBC - 2bAD - 3cABB - 3cAAC - 4dA^3B - eA^5}{a}\,y^5 + \text{etc.}
\end{aligned}$$

Note that de Moivre also included the coefficient of $y^6$. Each capital letter denoted
the coefficient of the preceding term. Thus, $A = \frac{g}{a}$ and $B = \frac{h - bAA}{a}$ and so on. His
proof first assumed that z had a series expansion $Ay + Byy + Cy^3 + Dy^4 +$ etc., then
substituted this for each z on the left side of the initial equation, and then equated
coefficients of powers of y to get A, B, C, D. To apply de Moivre’s formula to get


21 See, for example, Newton’s annotations on Viète in Newton (1967-1981) vol. 1, pp. 78-83.
22 de Moivre (1698).
23 E.g. Wallis (1685) pp. 318-319 and pp. 330-346.
Newton’s formula, recall that the latter involved the expansion of z = sin nθ in powers
of y = sin θ. Clearly, we can write

$$\arcsin z = n\arcsin y.$$

When the arcsines are replaced by their power series expansions, we have

$$z + \frac{z^3}{6} + \frac{3z^5}{40} + \frac{5z^7}{112} + \cdots = ny + \frac{ny^3}{6} + \frac{3ny^5}{40} + \frac{5ny^7}{112} + \cdots.$$

De Moivre applied his general result to this special equation to obtain

$$A = n, \quad B = 0, \quad C = -\frac{n(n^2-1)}{6}, \quad D = 0, \quad E = \frac{n(n^2-1)(n^2-3^2)}{5!}, \ \ldots,$$

thereby completing a proof of Newton’s formula.
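De Moivre's procedure of substituting an assumed series and equating coefficients is easy to mechanize. The sketch below is a modern illustration with hypothetical function names, not a transcription of de Moivre's paper; it solves for the coefficients one power of y at a time with exact fractions and applies the result to arcsin z = n arcsin y with n = 5.

```python
from fractions import Fraction
from math import comb


def multiply_truncated(p, q, order):
    """Product of two power series (lists indexed by power), kept up to y**order."""
    out = [Fraction(0)] * (order + 1)
    for i, p_i in enumerate(p):
        if p_i == 0:
            continue
        for j, q_j in enumerate(q):
            if i + j > order:
                break
            out[i + j] += p_i * q_j
    return out


def revert(lhs, rhs, order):
    """Solve sum_j lhs[j] z**j = sum_k rhs[k] y**k for z = A y + B y**2 + ...

    Assumes lhs[0] == rhs[0] == 0, lhs[1] != 0, and both lists longer than order.
    """
    z = [Fraction(0)] * (order + 1)
    for m in range(1, order + 1):
        # z**2, ..., z**m contribute to y**m only through already known coefficients.
        known = Fraction(0)
        power = list(z)
        for j in range(2, m + 1):
            power = multiply_truncated(power, z, order)
            known += lhs[j] * power[m]
        z[m] = (rhs[m] - known) / lhs[1]
    return z


def arcsin_series(order):
    """Coefficients of arcsin x = x + x**3/6 + 3x**5/40 + ... up to x**order."""
    c = [Fraction(0)] * (order + 1)
    for k in range((order - 1) // 2 + 1):
        c[2 * k + 1] = Fraction(comb(2 * k, k), 4**k * (2 * k + 1))
    return c


n = 5
lhs = arcsin_series(7)
rhs = [n * c for c in lhs]
coeffs = revert(lhs, rhs, 7)
print(coeffs[1], coeffs[3], coeffs[5], coeffs[7])
```

For n = 5 the series terminates, in agreement with Newton's remark about odd n; replacing n by a fraction such as Fraction(1, 2) gives the leading coefficients of the corresponding infinite series.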


8.7 Stirling’s Proof of Newton’s Formula 


By studying Stirling’s unpublished notebooks, Ian Tweddle discovered that Stirling
gave yet another proof of Newton’s formula, by means of differential equations.²⁴
This work was probably done before 1730. Stirling took variables $y = r\sin\theta$ and
$v = r\sin n\theta$, where n was any positive number. He used geometric considerations to
define these variables and these required that $0 < n\theta < \frac{\pi}{2}$, but his proof is actually
valid for $0 < |\theta| < \frac{\pi}{2}$ with n any real number. Since $\theta = \arcsin\left(\frac{y}{r}\right)$, $\dot\theta = \frac{\dot y}{\sqrt{r^2-y^2}}$ and
similarly $n\dot\theta = \frac{\dot v}{\sqrt{r^2-v^2}}$, Stirling obtained the fluxional equation

$$\frac{n\dot y}{\sqrt{rr-yy}} = \frac{\dot v}{\sqrt{rr-vv}}; \tag{8.18}$$

after squaring, he got

$$\frac{n^2\dot y^2}{r^2-y^2} = \frac{\dot v^2}{r^2-v^2}.$$

He cross multiplied to obtain

$$n^2r^2\dot y^2 - n^2\dot y^2v^2 = \dot v^2r^2 - y^2\dot v^2.$$

We note that until the middle of the nineteenth century, mathematicians sometimes
wrote xx and sometimes $x^2$. He took the fluxion (derivative) of this equation, assuming
that $\dot y$ was uniform. This meant that $\ddot y = 0$. Thus, he had the equation

$$-n^2\dot y^2 v\dot v = \dot v\ddot v r^2 - \dot v\ddot v y^2 - y\dot y\dot v^2, \quad\text{or}\quad -n^2\dot y^2 v = \ddot v r^2 - \ddot v y^2 - y\dot y\dot v.$$


24 Tweddle (1988) pp. 67-68. 
He next set $\dot y = 1$, without loss of generality, to obtain

$$\ddot v(r^2 - y^2) - y\dot v + n^2 v = 0. \tag{8.19}$$

Assuming a series solution, Stirling set

$$v = Ay + By^3 + Cy^5 + Dy^7 + \text{etc.} \tag{8.20}$$

After substituting (8.20) in (8.19), Stirling found the coefficients, completing the
derivation:

$$B = \frac{1-n^2}{2\cdot 3r^2}A, \quad C = \frac{9-n^2}{4\cdot 5r^2}B, \quad D = \frac{25-n^2}{6\cdot 7r^2}C, \ \text{etc.}$$
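The recurrence among A, B, C, D can be checked numerically. The sketch below is an illustration under stated assumptions rather than Stirling's own computation: it takes A = n (so that v behaves like ny for small y), builds the series from the ratio of consecutive coefficients, and compares the sum with r sin nθ. The function name and default values are hypothetical.

```python
import math


def stirling_sin_series(n, theta, r=1.0, terms=60):
    """Sum v = A y + B y^3 + C y^5 + ... with y = r sin(theta).

    Successive coefficients obey
    c_{k+1} = ((2k+1)^2 - n^2) / ((2k+2)(2k+3) r^2) * c_k, starting from c_0 = n.
    """
    y = r * math.sin(theta)
    coeff = n  # assumed leading coefficient A = n
    total = 0.0
    for k in range(terms):
        total += coeff * y ** (2 * k + 1)
        coeff *= ((2 * k + 1) ** 2 - n * n) / ((2 * k + 2) * (2 * k + 3) * r * r)
    return total


# Non-integer n: the series is infinite but converges quickly for small theta.
print(stirling_sin_series(2.5, 0.3), math.sin(2.5 * 0.3))
```

With n an odd integer the factor (2k + 1)² − n² eventually vanishes, so the sum terminates, as Newton observed.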


In his 1730 Methodus Differentialis, proposition 15, Stirling briefly explained why
he used a series of the form (8.20) to solve the differential equation. He set $v = Ay^m$
and substituted this expression in the differential equation to get

$$(m^2 - m)Ay^{m-2} + (n^2 - m^2)Ay^m = 0. \tag{8.21}$$

To obtain the lowest power of y in the series solution, he then set $m^2 - m = 0$ to
obtain m = 0 or m = 1. Thus, the lowest power of y was either 0 or 1. By (8.21), the
powers had to increase by two, so that either v was given by the series (8.20), or else

$$v = A + By^2 + Cy^4 + Dy^6 + \text{etc.} \tag{8.22}$$

By using (8.22) in a similar way, he obtained the formula (8.17) for cos nθ.

Newton did not state the cosine formula, but he must have known it from his
1664 study of Viète’s booklet on angular sections, written in 1591 but published in
1615 with proofs supplied by Alexander Anderson, an uncle of the great Scottish
mathematician James Gregory. In this paper, Viète expressed in geometric terms
the formulas for cos nθ and sin nθ in powers of cos θ, with n an integer. He
explicitly pointed out the appearance of the figurate numbers as coefficients of these
polynomials. As a student, Newton made annotations on this work of Viète, though
they do not indicate that he knew his formula (8.16) at that time.²⁵ The manner in
which he wrote the coefficients of the powers of sin θ in his letter for Leibniz suggests
that he found the result after his discovery of the binomial theorem. It is even likely
that he found the formula while reviewing his old notes before writing his first letter
for Leibniz in June 1676.

The methods employed by de Moivre and Stirling to prove Newton’s formulas were
familiar to Newton in 1676. In fact, it is very likely that Newton had already found
a proof. It is possible that he first came upon the formula for sin nθ by interpolation,
but he wrote in his letter for Leibniz that he had discarded interpolation as a method
of proof. Since Newton was very cautious, he must have had an alternative derivation
when he communicated it to Leibniz, though he gave no hint of what it was.


25 See Newton (1967-1981) vol. 1, pp. 78-84. 
In 1812, Gauss applied (8.16) to produce an unusual proof of Euler’s gamma
function formula $\Gamma(x)\Gamma(1-x) = \frac{\pi}{\sin \pi x}$.²⁶ He also briefly mentioned that he could
prove (8.16) using transformations of hypergeometric functions. Yet another proof
was given by Cauchy in his École Polytechnique lectures of 1821.²⁷ He also observed
that the series for arcsin x could be derived by equating the coefficient of n on
both sides of (8.16). Similarly, by equating the coefficient of $n^2$ in (8.17), Cauchy
obtained

$$\frac{1}{2}(\arcsin x)^2 = \sum_{n=0}^{\infty} \frac{2^{2n}(n!)^2}{(2n+2)!}\,x^{2n+2}, \tag{8.23}$$

a result he attributed to J. de Stainville who published it in 1815. In November 1737,
the particular case of the above series where x = 1 was discovered by Johann
Bernoulli who communicated it to his former student Euler.²⁸ Euler responded by
using differential equations to prove the more general formula (8.23).²⁹ As we shall
see later, Bernoulli’s method can be modified slightly to prove the general case. Even
before Bernoulli and Euler, Takebe Katahiro published these two series in his 1722
Yenri Tetsujutsu.³⁰
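Formula (8.23) lends itself to a quick numerical check. The following is a minimal sketch with an illustrative function name; it accumulates the terms through the ratio of consecutive terms, 4(n + 1)²x²/((2n + 3)(2n + 4)), and compares the sum with ½(arcsin x)² at x = 1/2, where arcsin x = π/6. At x = 1 the terms decrease only like n^(−3/2), so the check is made well inside the interval of convergence.

```python
import math


def half_arcsin_squared(x, terms=200):
    """Partial sum of sum_{n>=0} 2^(2n) (n!)^2 / (2n+2)! * x^(2n+2)."""
    term = x * x / 2.0  # n = 0 term
    total = 0.0
    for n in range(terms):
        total += term
        # Ratio of the (n+1)-th term to the n-th term.
        term *= 4.0 * (n + 1) ** 2 * x * x / ((2 * n + 3) * (2 * n + 4))
    return total


print(half_arcsin_squared(0.5), (math.pi / 6) ** 2 / 2)
```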


8.8 Zolotarev: Lagrange Inversion with Remainder 


Newton’s statement of his two theorems on the inversion of series suggests that he
got them by using the method of undetermined coefficients, though his related work
employs successive approximation. In 1769, Lagrange published a more interesting
result now referred to as the Lagrange inversion formula. This work was done in
connection with an application to celestial mechanics. Lagrange’s formula stated that
if $z = a + x\phi(z)$, then

$$F(z) = F(a) + x\phi(a)F'(a) + \frac{x^2}{1\cdot 2}\frac{d}{da}\left(\phi^2(a)F'(a)\right) + \frac{x^3}{1\cdot 2\cdot 3}\frac{d^2}{da^2}\left(\phi^3(a)F'(a)\right) + \cdots.$$

In support of this formula, Lagrange gave a complicated argument using divergent
series.³¹ The remainder term for the Lagrange series was apparently first determined
by Robert Murphy in 1833;³² then in 1861, A. Popoff rediscovered this

26 Gauss (1813).
27 Cauchy et al. (2009) p. 376.
28 Eu. 4A-2, p. 186.
29 ibid. p. 196.
30 Smith and Mikami (1914) pp. 148-149.
31 See Johnson (2007) for interesting historical remarks and references. Johnson also fills out the details of
Lagrange’s sketchy proof from Théorie des fonctions analytiques.
32 Murphy (1833a).
remainder term.³³ In 1890, the mathematician J. J. Walker wrote a paper, “Of the
Influence of Applied on the Progress of Pure Mathematics,” and pointed out
Murphy’s priority:³⁴


The real novelty in Murphy’s memoir was the expression given in the form of a definite integral— 
for the “error” involved in stopping at any given term of Lagrange’s Series. This is quite different 
in form from Popoff’s expression (C.R. [Comptes Rendus] 1861, pp. 795-8), which appears to be 
considered as the first attempt to sum up the remainder. 


In 1876, Zolotarev gave a simple proof of the Lagrange series with remainder:

$$F(z) = F(a) + \sum_{k=1}^{n} \frac{x^k}{k!}\frac{d^{k-1}}{da^{k-1}}\left(\phi^k(a)F'(a)\right) + \frac{1}{n!}\frac{d^n}{da^n}\left(\int_a^z (x\phi(u) + a - u)^n F'(u)\,du\right).^{35}$$

He proved this formula by setting

$$S_n = \int_a^z (x\phi(u) + a - u)^n F'(u)\,du$$

and observing that differentiation with respect to a immediately yielded

$$\frac{dS_n}{da} = nS_{n-1} - x^n\phi^n(a)F'(a).$$

By setting n = 1, 2, ..., n, he arrived at the n relations

$$\begin{aligned}
S_0 &= x\phi(a)F'(a) + \frac{dS_1}{da}, \\
2S_1 &= x^2\phi^2(a)F'(a) + \frac{dS_2}{da}, \\
&\ \ \vdots \\
nS_{n-1} &= x^n\phi^n(a)F'(a) + \frac{dS_n}{da}.
\end{aligned}$$

Noting that

$$S_0 = \int_a^z F'(u)\,du = F(z) - F(a),$$

and by substituting the nth equation into the (n − 1)th equation and continuing the
process, he obtained the required formula.
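A concrete instance may help to fix ideas. The sketch below is an illustration chosen here, not an example taken from Lagrange or Zolotarev: it assumes φ(z) = z² and F(z) = z, so that z = a + xz². The kth term of the Lagrange series is then (x^k/k!) d^(k−1)/da^(k−1)(a^(2k)), which reduces to the kth Catalan number times x^k a^(k+1), and the partial sums can be compared with the root of the quadratic that tends to a as x tends to 0.

```python
import math


def lagrange_series_quadratic(a, x, terms=100):
    """Partial sum of the Lagrange series for z = a + x*z**2 with F(z) = z."""
    total = a
    for k in range(1, terms + 1):
        # (2k)! / (k! (k+1)!) arises from differentiating a**(2k) k-1 times
        # and dividing by k!.
        catalan = math.comb(2 * k, k) // (k + 1)
        total += catalan * x**k * a ** (k + 1)
    return total


def exact_root(a, x):
    """Root of z = a + x*z**2 that tends to a as x tends to 0."""
    return (1 - math.sqrt(1 - 4 * a * x)) / (2 * x)


print(lagrange_series_quadratic(0.5, 0.3), exact_root(0.5, 0.3))
```

The series converges only for |4ax| < 1, which illustrates why a remainder term of the kind given by Murphy and Zolotarev is needed to control the error of a truncated sum.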


33 Popoff (1861). 
34 Walker (1890). 
35 Zolotarev (1876). 
8.9 Exercises


(1) Show that Newton’s series in (8.1) and (8.3) can be obtained by repeated 
division. 

(2) Apply the method of finding square roots to the polynomial $a^2 + x^2$ to obtain
Newton’s series (8.5).

(3) Carry out Newton’s procedure for successive approximation of a solution of 
(8.8) to obtain the series (8.10). 

(4) Verify Newton’s two theorems on the reversion of series, given in equations 
(8.12) and (8.13). 

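(5) Prove that for any real number n and nonnegative integer s,

$$\binom{n}{0}\binom{n/2}{s} + \binom{n}{2}\binom{(n-2)/2}{s-1} + \binom{n}{4}\binom{(n-4)/2}{s-2} + \cdots = \frac{n^2(n^2-2^2)(n^2-4^2)\cdots(n^2-(2s-2)^2)}{(2s)!},$$

$$\binom{n}{1}\binom{(n-1)/2}{s} + \binom{n}{3}\binom{(n-3)/2}{s-1} + \binom{n}{5}\binom{(n-5)/2}{s-2} + \cdots = \frac{n(n^2-1^2)(n^2-3^2)\cdots(n^2-(2s-1)^2)}{(2s+1)!}.$$

Cauchy used these identities without proof in his Analyse algébrique to prove
Newton’s formulas (8.16) and (8.17). A proof depends on the Vandermonde
or Chu–Vandermonde identity; see Chapter 23.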


(6) Show that for any real number n and $|\theta| < \frac{\pi}{4}$,

$$\cos n\theta = \cos^n\theta - \frac{n(n-1)}{1\cdot 2}\cos^{n-2}\theta\,\sin^2\theta + \frac{n(n-1)(n-2)(n-3)}{1\cdot 2\cdot 3\cdot 4}\cos^{n-4}\theta\,\sin^4\theta - \cdots$$

and

$$\sin n\theta = \frac{n}{1}\cos^{n-1}\theta\,\sin\theta - \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}\cos^{n-3}\theta\,\sin^3\theta + \cdots.$$

Viète came close to stating these formulas. Cauchy pointed out in his
Analyse algébrique that $|\theta| < \frac{\pi}{4}$ was necessary to expand $(\cos\theta + i\sin\theta)^n$
by the binomial theorem when n was not a positive integer.

(7) Replace $\cos^l\theta$ by $(1 - \sin^2\theta)^{l/2}$ in Exercise 6 and expand by the binomial
theorem. Then deduce Newton’s formulas (8.16) and (8.17) for $|\theta| < \frac{\pi}{4}$.

(8) Prove that if $z = g(a + x\phi(z))$, then

$$f(z) = f(g(a)) + \sum_{k=1}^{n} \frac{x^k}{k!}\frac{d^{k-1}}{da^{k-1}}\left(\phi^k(g(a))\,\frac{d}{da}f(g(a))\right) + \frac{1}{n!}\frac{d^n}{da^n}I_n(a),$$

where $I_n(a) = \int_a^{g^{-1}(z)} (x\phi(g(t)) + a - t)^n f'(g(t))\,dt$.
See Edwards (1954b) vol. I, pp. 373-374. An equivalent result was published
by Emory McClintock (1881) pp. 96-97. McClintock (1840-1916), who
served as president of the American Mathematical Society and was
instrumental in the founding of the Bulletin and the Transactions, was an actuary
by profession. See Johnson (2007) for historical remarks on the Lagrange
series.


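(9) Show that if g(0) = 0, then

$$f(x) = f(0) + g(x)\left(\frac{x f'(x)}{g(x)}\right)_{x=0} + \frac{g(x)^2}{2!}\left(\frac{d}{dx}\left(f'(x)\left(\frac{x}{g(x)}\right)^{2}\right)\right)_{x=0} + \frac{g(x)^3}{3!}\left(\frac{d^2}{dx^2}\left(f'(x)\left(\frac{x}{g(x)}\right)^{3}\right)\right)_{x=0} + \cdots.$$

See Edwards (1954a) p. 459.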


(10) Show that Newton’s differential equation $\frac{\dot y}{\dot x} = 1 + \frac{y}{a-x}$ can be written in the
form

$$(a - x + y)(y' - 1) - yy' = 0,$$

where $y' = \frac{dy}{dx}$. Show that this can be directly integrated. This observation is
due to Whiteside; see Newton (1967-81) vol. II, p. 101.
(11) Show that Newton’s second differential equation $\frac{\dot y^2}{\dot x^2} = \frac{\dot y}{\dot x} + x^2$ can be
integrated in closed form in terms of the logarithmic function.


(12) Show that 


This was proved by Tanzan Shokei in his 1728 Yenri Hakki. 
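(13) Show that

$$\frac{\pi^2}{9} = 1 + \frac{1^2}{3\cdot 4} + \frac{1^2\cdot 2^2}{3\cdot 4\cdot 5\cdot 6} + \frac{1^2\cdot 2^2\cdot 3^2}{3\cdot 4\cdot 5\cdot 6\cdot 7\cdot 8} + \cdots,$$

$$\frac{\pi}{3} = 1 + \frac{1^2}{4\cdot 6} + \frac{1^2\cdot 3^2}{4\cdot 6\cdot 8\cdot 10} + \frac{1^2\cdot 3^2\cdot 5^2}{4\cdot 6\cdot 8\cdot 10\cdot 12\cdot 14} + \cdots.$$

These were presented by Matsunaga Ryohitsu in his Hoyen Sankyo of 1738.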
(14) Prove that

$$\frac{\pi}{4} = 1 - \frac{1}{2\cdot 3} - \frac{1}{8\cdot 5} - \frac{3}{48\cdot 7} - \frac{15}{384\cdot 9} - \frac{105}{3840\cdot 11} - \cdots.$$

This was presented by Hasegawa Ko in his Kyuseki Tsuko of 1844.
For Exercises 12–14, see Mikami (1974) pp. 213-215.
(15) Define

$$f(m) = 1 + m(i\sin\phi) + \frac{m^2}{2!}(i\sin\phi)^2 + \frac{m(m^2-1^2)}{3!}(i\sin\phi)^3 + \frac{m^2(m^2-2^2)}{4!}(i\sin\phi)^4 + \cdots,$$

where m is real and $i = \sqrt{-1}$.
(a) Show that

$$f(m_1)f(m_2) = f(m_1 + m_2).$$

(b) Use the method of Exercise 7 to prove that Newton’s formulas hold for
$|\phi| < \frac{\pi}{2}$ when n is a positive integer. Deduce that $f(m) = \cos m\phi + i\sin m\phi$
when m is a positive integer.

(c) Show that $f\left(\frac{p}{q}\right) = \cos\frac{p\phi}{q} + i\sin\frac{p\phi}{q}$ when p and q are integers.

(d) Show that f(m) is continuous and deduce Newton’s formulas for $|\phi| < \frac{\pi}{2}$.

See Hobson (1957b) pp. 273-277.


8.10 Notes on the Literature 


The De Analysi was first published by William Jones in 1711, over forty years 
after it was written. An English translation was published in 1745. This translation 
was reprinted in Newton (1964-1967). Whiteside’s English translation is contained 
in Newton (1967-1981), vol. II. Newton wrote this paper so that he should not 
completely lose his priority in the discovery of the methods of infinite series; he 
circulated the manuscript privately to several people who were interested in the 
topic, but he did not want to publish it. For the purpose of publication, he wrote 
a much longer tract, De Methodis Serierum et Fluxionum, in 1671. Due to the 
difficulties of finding a publisher and other concerns, Newton did not complete 
the work or publish it. In 1736 John Colson published an English translation, 
soon retranslated by Castillione into Latin; in 1799 Samuel Horsley published 
the original Latin version. Whiteside remarked that a comparison of these two 
translations provides “an instructive check on the clarity and fluency of Newton’s 
Latin style.” 
Newton used the material in the De Methodis to construct his two letters to
Oldenburg for Leibniz in 1676. The second letter was quite long, and in it Newton 
gave a fairly complete account of his work on infinite series. In 1712, he included 
these letters in the Commercium Epistolicum, produced by a Royal Society committee 
headed by Newton to establish conclusively that he was the ‘first inventor’ of the 
calculus. The letters have been republished with English translations in the second 
volume of Newton’s Correspondence, Newton (1959-1960). 

It is remarkable that Newton’s mathematical works were published in their entirety 
only 250 years after his death. Early attempts to accomplish this task were abandoned 
because his papers were in a state of disarray and were stored in several different 
locations. It was even assumed that all of Newton’s significant results were already 
published. Thus, before Whiteside’s monumental work, the world was unaware of a 
number of the results of Newton discussed in this book: transformation of series by 
finite differences, the first clear statement of Taylor’s formula, and the expression of 
an iterated integral as a single integral. D. T. Whiteside (1932-2008) studied French 
and Latin at Bristol University; he was self-taught in mathematics. As a graduate 
student at Cambridge, he became deeply interested in the history of mathematics; 
his doctoral thesis on seventeenth-century mathematics became a classic. In the 
course of his studies, Whiteside asked to see the papers of Newton, still piled in 
boxes, and soon resolved to sort and edit them. Cambridge University Press, the 
world’s oldest continually operating press, chartered by Henry VIII, published the 
eight handsome volumes between 1967 and 1982 from Whiteside’s handwritten 
manuscript and hand-drawn diagrams, with facing pages giving the English and Latin. 
Whiteside’s commentary and notes are extensive and invaluable. Whiteside executed 
this prodigious task in twenty years, with the excellent assistance of his thesis advisor 
Michael Hoskin and Adolf Prag, a teacher at Westminster School. 

For a discussion of Takebe’s work, see Mikami (1974) and Ogawa and
Morimoto (2018). The formula for $(\arcsin x)^2$ was also discovered by Ming An-tu who
was Manchurian by birth. It appeared in his Ko-yuan Mi-lu Chieh-fa of 1774, some
years after his death. Ming had not completed the work before he died, and his son
Hsin finished it. Ming An-tu’s work on infinite series was inspired by the three infinite
series communicated to Chinese mathematicians by the French Jesuit Pierre Jartoux
in 1702. These were Newton’s series for sine, cosine and arcsine.

Pierre Jartoux (1670-1720) was a French Jesuit missionary who entered China in
1701. He is said to have communicated either three or nine series for trigonometric
functions to Chinese mathematicians. There is some doubt as to how much
information he brought from Europe and how much the Chinese and Japanese mathematicians
independently discovered. There is no doubt that he communicated the series for
sine, cosine, and arcsine. But there is some question about the other six formulas,
one of which is Takebe’s series for $(\arcsin x)^2$. Though Jartoux’s original notes are
lost, Smith and Mikami (1914) suggested that the series for $(\arcsin x)^2$ was also
introduced by Jartoux, who had been in correspondence with Leibniz. This appears
to be unlikely. Jartoux was not a mathematician, and his correspondence with Leibniz
was on an astronomical topic. If Jartoux knew the series for $(\arcsin x)^2$, he would
have informed Leibniz, and perhaps others, because this would have been a new
discovery. In fact, in 1737, when Euler and Bernoulli rediscovered this result and its
particular case dealing with $\pi^2$, they regarded their formulas as original. And these
mathematicians were very well aware of the works of all European mathematicians at
that time. We may conclude that Takebe was the first to find the series for $(\arcsin x)^2$
and the corresponding series for $\pi^2$, while Ming’s discovery was independent, though
inspired by a knowledge of the series communicated by Jartoux.


9 


Finite Differences: Interpolation 
and Quadrature 


9.1 Preliminary Remarks 


The method of interpolation for the construction of tables of trigonometric functions
has been used for over two thousand years. In this method, one may tabulate the
values of a function f(x) constructed from first principles (definitions) for x = a and
x = a + h, where h is small, and then interpolate the values between a and a + h,
without further computation from first principles. For sufficiently small h, one may
approximate the function f(x) by a linear function on the interval [a, a + h]. This
means that, in order to interpolate the values of the function in this interval, one may
use the approximation $f(a + \lambda h) \approx f(a) + \lambda(f(a+h) - f(a))$, $0 < \lambda < 1$. In his
Almagest of around 150 AD, Ptolemy applied linear interpolation to construct a table
of lengths of chords of a circle as a function of the corresponding arcs. These are the
oldest trigonometric tables in existence, though Hipparchus may well have constructed
similar tables almost three centuries earlier.¹ In Ptolemy’s table, the length of the chord
was given as $2R\sin\theta$, where R was the radius and $2\theta$ was the angle subtended by the
arc. Later mathematicians in India, on the other hand, tabulated the half chord; when
divided by the radius, this gives our sine.

From the beginning of the seventh century, Chinese and Indian mathematicians 
developed second-order interpolation, in which they employed second-degree polyno- 
mials in order to determine values between a and a + h. To do this, they had to take 
second differences of the initially determined values. Let A denote the first difference: 


$$\Delta f(a) = f(a+h) - f(a);$$

linear interpolation may then be performed by using

$$f(a + \lambda h) \approx f(a) + \lambda\Delta f(a),$$

or

$$f(a + \lambda h) = f(a) + \lambda\Delta f(a). \tag{9.1}$$

1 Chabert (1999) pp. 321-328.


Around 600 AD, the Chinese astronomer Liu Zhuo used second-order differences
to calculate the positions of the sun and moon. Suppose f(a), f(a + h), f(a + 2h)
are the observed values of the positions at equal time intervals h. Setting

$$\Delta_1 = f(a+h) - f(a), \qquad \Delta_2 = f(a+2h) - f(a+h),$$

for $0 < \lambda < 1$, Liu Zhuo’s formula for the interpolated values was

$$f(a+\lambda h) = f(a) + \frac{\lambda}{2}(\Delta_1 + \Delta_2) + \lambda(\Delta_1 - \Delta_2) - \frac{\lambda^2}{2}(\Delta_1 - \Delta_2). \tag{9.2}$$

The second difference appears in the terms containing $\Delta_1 - \Delta_2$. For j a nonnegative
integer, let $\Delta$ denote the first difference

$$\Delta f(a + jh) = f(a + (j+1)h) - f(a + jh);$$

the second difference would then be given by

$$\begin{aligned}
\Delta^2 f(a + jh) &= \Delta(\Delta f(a + jh)) \\
&= \Delta f(a + (j+1)h) - \Delta f(a + jh) \\
&= f(a + (j+2)h) - f(a + (j+1)h) - f(a + (j+1)h) + f(a + jh).
\end{aligned}$$

Thus one has

$$\Delta_2 - \Delta_1 = f(a+2h) - f(a+h) - f(a+h) + f(a) = \Delta^2 f(a).$$

It is not difficult to show that Liu Zhuo’s formula (9.2) is equivalent to

$$f(a+s) = f(a) + s\Delta f(a) + \frac{s(s-1)}{1\cdot 2}\Delta^2 f(a), \quad s = \lambda h, \quad 0 < \lambda < 1, \tag{9.3}$$

the Gregory–Newton interpolation formula, discussed in Section 9.3.
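The equivalence of (9.2) and (9.3) can be confirmed with exact arithmetic. In the sketch below, which uses illustrative function names, the Gregory–Newton form is written in terms of λ, that is, with the argument measured in units of the step h; this normalization is an assumption made here for the comparison.

```python
from fractions import Fraction


def liu_zhuo(f0, f1, f2, lam):
    """Formula (9.2): f0, f1, f2 are the values at a, a + h, a + 2h."""
    d1, d2 = f1 - f0, f2 - f1
    return f0 + lam / 2 * (d1 + d2) + lam * (d1 - d2) - lam**2 / 2 * (d1 - d2)


def gregory_newton_second_order(f0, f1, f2, lam):
    """Second-order forward difference formula with step fraction lam."""
    first = f1 - f0
    second = f2 - 2 * f1 + f0
    return f0 + lam * first + lam * (lam - 1) / 2 * second


lam = Fraction(1, 2)
# Values of t**2 at t = 0, 1, 2: both formulas recover (1/2)**2 exactly.
print(liu_zhuo(0, 1, 4, lam), gregory_newton_second_order(0, 1, 4, lam))
```

Both expressions are quadratic in λ and agree at the three tabulated points, which is another way to see that they coincide.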

In 727, the astronomer Yi Xing (683-727) used a second difference formula for
unequal intervals to deal with the situation in which observations of the positions
of the sun or moon were carried out at varying intervals.³


2 Li Yan and Du Shiran (1987) pp. 89-91. 
3 Li Yan and Du Shiran (1987) p. 91. 
In his 628 work, Dhyanagrahopadesadhyaya, the Indian mathematician and
astronomer Brahmagupta (598-668) employed the formula,⁴ here converted to
modern notation:

$$f(a+\lambda h) = f(a) + \frac{\lambda}{2}\left(\Delta f(a-h) + \Delta f(a)\right) + \frac{\lambda^2}{2}\left(\Delta f(a) - \Delta f(a-h)\right), \tag{9.4}$$

the second-order Newton–Stirling interpolation formula, (9.17), given in Section 9.2.
In his 665 Khandakhadyaka, Brahmagupta also described a general method of
interpolation for unequal intervals, reducible to (9.4) when intervals were equal.

Parabolic or second-degree polynomial methods of interpolation were also 
employed by Islamic mathematicians such as al-Biruni (973-1048). Indian 
astronomers had constructed tables of sine and cosine values, multiplied by a radius. 
Having traveled and spent time in India, al-Biruni was aware of these tables and 
expanded them to include tangent and cotangent functions. 

By the seventeenth century, the requirements of navigation and astronomy
demanded finer tables of trigonometric and related functions; this led to the invention
of the logarithm and better interpolation methods. Motivated by the needs of
navigation, in 1611 or a little earlier, Thomas Harriot wrote a remarkable treatise, De
Numeris Triangularibus et inde de Progressionibus Arithmeticis: Magisteria Magna,
considering finite differences of third and higher order.⁵ He gave the fifth-order
interpolation formula, expressed in modern notation as

$$f(x) = f(0) + \binom{x}{1}\Delta f(0) + \binom{x}{2}\Delta^2 f(0) + \cdots + \binom{x}{5}\Delta^5 f(0), \tag{9.5}$$

where

$$\binom{x}{k} = \frac{x(x-1)(x-2)\cdots(x-k+1)}{k!} \tag{9.6}$$

were the binomial coefficients and $\Delta f(0) = f(1) - f(0)$, $\Delta^2 f(0) = \Delta(\Delta f(0)) =
\Delta f(1) - \Delta f(0) = f(2) - f(1) - (f(1) - f(0))$, etc. In Harriot’s work, x took rational
values and he used his formula to interpolate between unit values of the argument.
He understood the values of (9.6) in terms of figurate numbers, instead of binomial
coefficients, when x was an integer.

Unfortunately, Harriot did not publish his work; some of his methods were 
rediscovered soon afterward by Henry Briggs (1561-1631). Briggs was the first 
professor of mathematics at Gresham College, London, as well as the first Savilian 
Professor at Oxford. In his Arithmetica Logarithmica of 1624, Briggs mentioned that 
the nth-order differences of the nth powers of integers were constants. According 
to Whiteside,⁶ this work contained tables of logarithms obtained by second-order


4 See Gupta (1969).
5 Beery and Stedall (2009).
6 Whiteside (1961b) p. 235. Whiteside remarks that Briggs apparently had some awareness of the later Gauss,
Bessel, and Stirling interpolation formulas.
interpolation, that is, taking the first three terms on the right side of Harriot’s formula
(9.5). Observe that if the second differences are approximately identical, then the third
and higher differences are approximately zero and can be neglected. More generally,
if the nth differences are approximately constant, then f(x) can be approximated
by the polynomial of degree n obtained by extending Harriot’s formula (9.5) to nth
differences.
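The two observations above, that the nth differences of nth powers are constant and that a function with constant nth differences is reproduced by the formula carried to nth differences, can be illustrated as follows. This is a sketch in modern terms with hypothetical function names; it uses exact fractions so that rational arguments, as in Harriot's work, cause no rounding.

```python
from fractions import Fraction


def forward_differences(values):
    """Return [f(0), Δf(0), Δ²f(0), ...] from the values f(0), f(1), ..., f(n)."""
    row = list(values)
    leading = [row[0]]
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        leading.append(row[0])
    return leading


def generalized_binomial(x, k):
    """x(x-1)...(x-k+1)/k!, as in (9.6), for a rational x."""
    result = Fraction(1)
    for i in range(k):
        result = result * (x - i) / (i + 1)
    return result


def interpolate(values, x):
    """Formula (9.5) extended to as many differences as the table allows."""
    diffs = forward_differences(values)
    return sum(generalized_binomial(Fraction(x), k) * d for k, d in enumerate(diffs))


cubes = [t**3 for t in range(4)]
print(forward_differences(cubes))         # third difference of cubes is 3! = 6
print(interpolate(cubes, Fraction(5, 2)))  # equals (5/2)**3
```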

Briggs also wrote Trigonometria Britannica, a book of trigonometric tables with 
a very long introduction giving details of his methods. Briggs’s friend, Henry 
Gellibrand, had this work published in 1633, after Briggs’s death. Unfortunately, 
the many users of these trigonometric tables may not have read the more important 
introduction in which Briggs gave some very interesting results, including the
binomial theorem for exponent $\frac{1}{2}$. However, the Scottish mathematician James
Gregory studied Briggs’s introduction and thereby learned interpolation methods.
Thus, also making use of advances in algebraic notation, and employing N. Mercator’s
discovery of infinite series, Gregory obtained interpolation formulas containing up to
an infinite number of terms. In an important letter to Collins, dated November 23,
1670, he communicated his formula,⁸ given below in my translation, describing it as
“both more easie and universal than either Briggs or Mercator’s, and also performed
without tables.” 


I remember you did once desire of me my method of finding the proportional parts in tables, which
is this: In figure 8 of my exercises [Exercitationes Geometricae], on the straight line AI consider
any segment Aα, to which there is a perpendicular αγ, such that γ lies on the curve ABH, the
rest remaining the same; let there be an infinite series [sequence] $\frac{a}{c}, \frac{a-c}{2c}, \frac{a-2c}{3c}, \frac{a-3c}{4c}$, etc., and
let the product of the first two terms of this series be $\frac{b}{c}$, of the first three terms $\frac{k}{c}$, of the first four
terms $\frac{l}{c}$, of the first five terms $\frac{m}{c}$, etc., to infinity; the straight line $\alpha\gamma = \frac{ad}{c} + \frac{bf}{c} + \frac{kh}{c} + \frac{li}{c} +$
etc. to infinity.


Gregory defined d, f, h, i, etc., as the successive differences of the ordinates, at
equal intervals c. He took f(0) = 0, so that d = f(c) − f(0) = f(c), f =
f(2c) − 2f(c), etc. After inserting the values of a, b, k, l, ..., Gregory’s formula
can be written as

$$f(a) = \frac{a}{c}\Delta f(0) + \frac{a(a-c)}{2c^2}\Delta^2 f(0) + \frac{a(a-c)(a-2c)}{6c^3}\Delta^3 f(0) + \frac{a(a-c)(a-2c)(a-3c)}{24c^4}\Delta^4 f(0) + \cdots. \tag{9.7}$$

This result is now known as the Gregory—Newton forward difference formula, but it 
may also be called the Harriot—Briggs formula. 

Newton’s interest in finite differences and interpolation appears to have been a 
response to an appeal from one John Smith for help with the construction of tables 


7 ibid. pp. 233-234 and footnote 14 on p. 236. Also see Whiteside (1961a).
8 Turnbull (1939) pp. 290-292. 
of square, cube, and fourth roots of numbers. Collins broadcast this appeal; he wrote
in a letter of November 23, 1674, to Gregory,⁹ “We have one Mr. Smith here taking
pains to afford us tables of the square and cube roots of all numbers from unit to 10000,
which will much facilitate Cardan’s rules.” Smith was an accountant and compiler of
tables whom Newton had helped five years earlier with the making of tables for the
areas of segments of circles. Newton again assisted Smith, writing to him on May
8, 1675, giving details for the construction of tables of roots.¹⁰ Newton explained to
Smith that he should tabulate the roots of every hundredth number n. From these, he
should construct the roots of every tenth number n + 10, n + 20, ... with a constant
third difference and thence the roots of n + 1, n + 2, ... with a constant second
difference. Newton also cautioned that all computations should be done to the tenth or
eleventh decimal place so as to obtain a table accurate to eight places. Newton’s ideas
on finite difference interpolation developed quite rapidly after this.

Around October 1676, Newton started composing his (incomplete) “Regula
Differentiarum” in which he presented the Newton–Stirling and Newton–Bessel formulas.¹¹
Perhaps he was not quite satisfied with this work; he next penned an untitled work on
his general interpolation techniques,¹² a monograph fairly close to the final version
that appeared in print in 1711 as the Methodus Differentialis.¹³

In a draft of his letter dated October 24, 1676,¹⁴ intended for Leibniz through
Oldenburg, secretary of the Royal Society of London, Newton set forth some of
his insights on interpolation, including statements of his general formula and the
Newton–Stirling formula. He later eliminated the portion of the letter on interpolation
because he saw a copy of a letter from Leibniz to Oldenburg,¹⁵ dated February 3,
1673, stating that Leibniz had independently discovered the Harriot–Briggs formula.
Newton perhaps assumed from this that Leibniz may have made progress parallel to
his own in the study of finite differences, though this was not the case.

In his Principia of 1687, in Lemma V of Book III, Newton published his general
formula now called the divided difference formula, but he did not publish some of its
corollaries, such as the Newton–Stirling and Newton–Bessel formulas. In 1708, Roger
Cotes (1682-1716) independently found the latter formula and included both formulas
in a paper, “Canonotechnia,” published posthumously in his Harmonia Mensurarum
of 1722.¹⁶ In Whiteside’s view¹⁷ it might be more appropriate to refer to the
Newton–Bessel formula as the Newton–Cotes formula, after its two independent creators.

Newton was the single most significant contributor to the theory of finite difference 
interpolation. Although many formulas in this subject are attributed jointly to Newton 
and some other mathematician, they are actually all due originally to Newton, with the 


9 Turnbull (1939) p. 291.
10 Newton (1959-1960) vol. 1, pp. 342-344.
11 Newton (1967-1981) vol. 4, pp. 36-51.
12 ibid. pp. 54-69.
13 See English translation in Newton (1964-1967) vol. 2, pp. 168-173.
14 Newton (1959-1960) vol. 2, pp. 130-161.
15 Gerhardt (1899) pp. 74-78.
16 Cotes (1722). Also see the paper “De methodo differentiali Newtoniana,” included in the same work.
17 Newton (1967-1981) vol. 4, pp. 60-61, footnote 22.
exception of the Gregory–Newton formula, due to Harriot and Briggs. The secondary 
mathematicians usually made use of these formulas in their numerical work. As early 
as 1730, Stirling pointed this out18 in his Methodus Differentialis: “After Newton 
several celebrated geometers have dealt with the description of the curve of parabolic 
type [defined by a polynomial] through any number of given points. But all their 
solutions are the same as those which have just been shown; indeed these differ 
scarcely from Newton’s solutions.” It is amusing that Stirling was subsequently 
honored by having his name attached to a formula he explicitly and modestly attributed 
to Newton. In his insightful work on the Principia, Chandrasekhar remarks, “It is a 
strange irony that in giving an account of Newton’s published work of 1711, we have 
to hyphenate his name with Gauss, Stirling, and Bessel!”19 

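Newton’s divided difference formula, in the notation of the French mathematicians 
A. M. Ampère (1775-1836) and A. L. Cauchy (1789-1857), was written as 

\[
\begin{aligned}
f(x) ={}& f(x_1) + (x - x_1) f(x_1, x_2) + (x - x_1)(x - x_2) f(x_1, x_2, x_3) + \cdots \\
&+ (x - x_1)(x - x_2) \cdots (x - x_{n-1}) f(x_1, x_2, \ldots, x_n) \\
&+ (x - x_1)(x - x_2) \cdots (x - x_n) f(x, x_1, \ldots, x_n),
\end{aligned}
\tag{9.8}
\]

where 

\[
f(x_1, x_2) = \frac{f(x_1) - f(x_2)}{x_1 - x_2}, \qquad
f(x_1, \ldots, x_k) = \frac{f(x_1, \ldots, x_{k-1}) - f(x_2, \ldots, x_k)}{x_1 - x_k}.
\tag{9.9}
\]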


If we denote the last term in (9.8) by $R_n(x)$, and the remaining sum by $P_{n-1}(x)$, 
then $P_{n-1}(x)$ is a polynomial of degree $n - 1$ equal to $f(x_i)$ at $x = x_i$ for $i = 1, 2, \ldots, n$. Note 
that this is true because $R_n(x_i) = 0$, $i = 1, \ldots, n$. Thus, $P_{n-1}(x)$ is the interpolating 
polynomial for a function $f(x)$ whose values are known at $x_1, x_2, \ldots, x_n$. In the 
1770s, Lagrange and Waring gave a different expression for this polynomial, more 
convenient for many purposes, especially for numerical integration. The 
Waring–Lagrange interpolating polynomial is easy to obtain, yet it is interesting to see different 
proofs presented in the 1820s by Cauchy20 and Jacobi.21 
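The recursion for divided differences and the Newton form of the interpolating polynomial translate directly into a short program. The following Python sketch is only an illustration: the function names `divided_differences` and `newton_polynomial` are our own, and exact rational arithmetic is assumed so that the check at the end involves no rounding.

```python
from fractions import Fraction

def divided_differences(xs, ys):
    """Return [f(x1), f(x1,x2), ..., f(x1,...,xn)] from the recursive definition."""
    xs = [Fraction(x) for x in xs]
    col = [Fraction(y) for y in ys]
    coeffs = [col[0]]
    for k in range(1, len(xs)):
        # After this step col[i] holds the k-th divided difference
        # on the k + 1 consecutive nodes starting at xs[i].
        col = [(col[i] - col[i + 1]) / (xs[i] - xs[i + k])
               for i in range(len(col) - 1)]
        coeffs.append(col[0])
    return coeffs

def newton_polynomial(xs, ys, x):
    """Evaluate the interpolating polynomial (the Newton form without remainder)."""
    total, product = Fraction(0), Fraction(1)
    for c, node in zip(divided_differences(xs, ys), xs):
        total += c * product
        product *= Fraction(x) - Fraction(node)
    return total

xs = [0, 1, 3, 4]
ys = [x**3 - 2 * x + 5 for x in xs]          # samples of a cubic
assert all(newton_polynomial(xs, ys, x) == y for x, y in zip(xs, ys))
assert newton_polynomial(xs, ys, 2) == 9      # four nodes recover the cubic exactly
```

Because the sample function is a cubic and four nodes are used, the remainder term vanishes and the polynomial reproduces the function at the unsampled point as well.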

James Gregory used interpolating polynomials to approximately evaluate the area 
under a curve. He communicated his quadrature formula to Collins in the letter 
containing his interpolation formula, deriving it by integrating the interpolating 
polynomial, just as Newton did in his Methodus Differentialis. Newton derived his 
three-eighths rule by integrating the third-degree polynomial obtained by taking the 
first four terms of (9.5). He explained:22 


18 Stirling and Tweddle (2003) p. 122. 
19 Chandrasekhar (1995) p. 495. 

20 Cauchy (1989) Note V. 

21 Jacobi (1826). 

22 Newton (1964-1967) vol. 2, p. 172. 
For example: If there are four ordinates at equal intervals, let A be the sum of the first and fourth, 
B the sum of the second and third, and R the interval between the first and fourth; then the central 
ordinate will be $\frac{9B - A}{16}$ and the area between the first and fourth ordinates will be $\frac{A + 3B}{8} R$. 
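Newton's rule for four ordinates can be checked on a cubic, for which the interpolating polynomial is the function itself. The sketch below reads the rule as central ordinate $(9B - A)/16$ and area $(A + 3B)R/8$; the function name `newton_four_ordinates` and the test function $x^3$ are assumptions made for illustration.

```python
from fractions import Fraction

def newton_four_ordinates(y0, y1, y2, y3, R):
    """Central ordinate and area from four equally spaced ordinates spanning R."""
    A = Fraction(y0) + Fraction(y3)   # sum of the first and fourth ordinates
    B = Fraction(y1) + Fraction(y2)   # sum of the second and third ordinates
    central = (9 * B - A) / 16
    area = (A + 3 * B) / 8 * Fraction(R)
    return central, area

f = lambda x: Fraction(x) ** 3        # ordinates taken at x = 0, 1, 2, 3
central, area = newton_four_ordinates(f(0), f(1), f(2), f(3), 3)
assert central == f(Fraction(3, 2))   # value of x^3 at the midpoint, 27/8
assert area == Fraction(81, 4)        # exact integral of x^3 over [0, 3]
```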


In 1707, Cotes, unaware of Newton’s then unpublished work in this area, composed 
a treatise on approximate quadrature. He wrote down formulas for areas when the 
number of ordinates was 3, 4, 5, ..., 11. The coefficients became fairly large after six 
ordinates; for example, his formula for eight ordinates was23 

\[
\frac{751A + 3577B + 1323C + 2989D}{17280}\, R,
\]

where A was the sum of the extreme ordinates, B the sum of the ordinates closest to the 
extremes, C the sum of the next ones, and D the sum of the two in the middle. Cotes’s 
paper, published posthumously in 1722, contained no proofs of his formulas. 
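Cotes's coefficients can be recovered by integrating the interpolating polynomial through equally spaced ordinates. The following Python sketch does this with exact rational arithmetic on the unit interval, so that $R = 1$; it is not Cotes's own procedure, and the helper names `poly_mul` and `cotes_weights` are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in increasing powers."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def cotes_weights(m):
    """Weights w_i such that the integral over [0, 1] is approximated by
    sum w_i * f(i / (m - 1)), using m equally spaced ordinates."""
    nodes = [Fraction(i, m - 1) for i in range(m)]
    weights = []
    for i, xi in enumerate(nodes):
        basis = [Fraction(1)]
        for j, xj in enumerate(nodes):
            if j != i:
                # multiply by the linear factor (x - xj) / (xi - xj)
                basis = poly_mul(basis, [-xj / (xi - xj), 1 / (xi - xj)])
        # integrate the basis polynomial term by term over [0, 1]
        weights.append(sum(c / (k + 1) for k, c in enumerate(basis)))
    return weights

eight = cotes_weights(8)
assert [w * 17280 for w in eight] == [751, 3577, 1323, 2989, 2989, 1323, 3577, 751]
```

Grouping the symmetric pairs of ordinates gives the four coefficients of A, B, C, and D in the formula for eight ordinates.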

Meanwhile, in 1719, Stirling published a paper in the Philosophical Transactions24 
on the same topic, presenting formulas for approximate areas for only the odd numbers 
of ordinates 3, 5, 7, and 9. He remarked that the approximations with an odd 
number of ordinates were more accurate than those with an even number. He did not 
prove this, though it is true. For example, it can be demonstrated that if $h$ is the spacing 
between adjacent ordinates and $n$ is the number of ordinates, then the error will be 
$O(h^{n+2})$ for odd $n$ but $O(h^{n+1})$ for even $n$.25 

The Newton—Cotes method of numerical integration was used for a century before 
Gauss developed a new approach, including a formula exact for any polynomial of 
degree 2n — 1 or less when n interpolation points were judiciously constructed. 
The Newton—Cotes formulas are exact only for polynomials of degree at most 
n — 1. Gauss’s procedure will be discussed in Chapter 24. A drawback of the 
Newton—Cotes and Gauss formulas was that the coefficients of the ordinates were 
unequal. The Russian mathematician P. L. Chebyshev (1821-1894) observed that 
when the ordinates f(x;) were experimentally obtained, they were liable to errors. 
Assuming that the probability of error in each of the ordinates was the same, the 
linear combination of the ordinates with equal coefficients had the least probable 
error among all the linear combinations with a given fixed sum of coefficients. 
Chebyshev observed that a quadrature formula with equal coefficients might often be 
preferable. Chebyshev studied mathematics at Moscow University from 1837 to 1841. 
He was interested in building mechanical gadgets and some of his papers deal 
with the mathematics involved with these. Chebyshev was of the view that his job 
as a mathematician was to consider practical problems and to give solutions both 
theoretically satisfying and practically useful. He repeatedly professed this opinion 
in his lectures and advocated it in several papers; his work on numerical integration 
may be seen as an example of this perspective. 


23 Gowing (1983) p. 119. 
24 Stirling (1719). 
25 Milne-Thomson (1981) pp. 166-170. 
In 1874, Chebyshev wrote a paper on quadrature with equal coefficients,26 
considering formulas of the type 

\[
\int_{-1}^{1} f(x)\phi(x)\,dx = k\bigl(f(x_1) + f(x_2) + \cdots + f(x_n)\bigr), \tag{9.10}
\]

where $\phi(x)$ was the weight function and $k$ was the common coefficient of the 
ordinates. He found a method, exact for polynomials of degree less than $n$, for 
determining the interpolation points $x_1, x_2, \ldots, x_n$ and the constant $k$, such that they 
depended upon $\phi$ but not on $f$. He worked out the details with the weights given as 
$\phi(x) = 1$ and $\phi(x) = \frac{1}{\sqrt{1 - x^2}}$. In particular, he showed that when $\phi(x) = 1$, then 
$k = \frac{2}{n}$ and $x_1, x_2, \ldots, x_n$ were the roots of the polynomial given by the polynomial 
portion of the expression 

\[
z^n \exp\Bigl(-\frac{n}{2 \cdot 3\, z^2} - \frac{n}{4 \cdot 5\, z^4} - \frac{n}{6 \cdot 7\, z^6} - \cdots\Bigr).
\]

He computed the zeros of these polynomials for $n = 2, 3, 4, 5, 6, 7$. Interestingly, 
Chebyshev was inspired to do this work by Hermite’s 1873 Paris lectures27 on the 
case $\phi(x) = \frac{1}{\sqrt{1 - x^2}}$. We observe that, even before Hermite, Brice Bronwin gave 
the formula for this case in a paper of 1849 in the Philosophical Magazine.28 In 
the chapter on numerical integration of their Calculus of Observations, Whittaker 
and Robinson discussed Chebyshev’s method and noted that naval architects found 
it useful.29 
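For small $n$ the polynomial portion in Chebyshev's result for $\phi(x) = 1$ can be computed exactly. The sketch below assumes the expression is $z^n \exp\bigl(-\sum_{k \ge 1} n / (2k(2k+1) z^{2k})\bigr)$ and expands the exponential as a power series in $1/z^2$ by the standard recurrence for the exponential of a series; the function name is hypothetical.

```python
from fractions import Fraction

def chebyshev_equal_weight_polynomial(n):
    """Coefficients e_m of z^(n - 2m), m = 0, ..., n // 2, in the polynomial
    part of z^n * exp(-n/(2*3 z^2) - n/(4*5 z^4) - ...)."""
    terms = n // 2
    # s[k] is the coefficient of w^k, w = 1/z^2, in the exponent
    s = [Fraction(0)] + [Fraction(-n, 2 * k * (2 * k + 1)) for k in range(1, terms + 1)]
    e = [Fraction(1)]
    for m in range(1, terms + 1):
        # exp(S) = sum e_m w^m satisfies  m * e_m = sum_{k=1..m} k * s_k * e_{m-k}
        e.append(sum(k * s[k] * e[m - k] for k in range(1, m + 1)) / m)
    return e

assert chebyshev_equal_weight_polynomial(2) == [1, Fraction(-1, 3)]   # z^2 - 1/3
assert chebyshev_equal_weight_polynomial(3) == [1, Fraction(-1, 2)]   # z^3 - z/2
assert chebyshev_equal_weight_polynomial(4) == [1, Fraction(-2, 3), Fraction(1, 45)]
```

For $n = 2$ the roots $\pm 1/\sqrt{3}$ give a two-point rule with equal weights that integrates $1$, $x$, and $x^2$ exactly on $[-1, 1]$.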


9.2 Newton: Divided Difference Interpolation 


Newton started his work on interpolation in the mid-1670s, and he had made sufficient 
progress to mention it briefly in his letter for Leibniz, dated October 24, 
1676. While discussing the problem of determining the area under a curve, especially 
when the expression for the curve led to difficult calculations of series, he wrote,30 


But I make little of this because, when simple series are not manageable enough, I have another 
method not yet communicated by which we have access to our solution at will. Its basis is a 
convenient, rapid and general solution of this problem, To draw a geometrical curve which shall 
pass through any number of given points. Euclid showed how to draw a circle through three 
given points. A conic section also can be described through five given points, and a curve of the 
third degree through eight given points; (so that I have it fully in my power to describe all the 
curves of that order, which can be determined by eight points only.) These things are done at once 
geometrically with no calculation intervening. But the above problem is of the second kind, and 
though at first it looks unmanageable, yet the matter turns out otherwise. For it ranks among the 
most beautiful of all that I could wish to solve. 


26 Chebyshev (1899-1907) pp. 165-180. 
27 Hermite (1873) pp. 452-454. 

28 Bronwin (1849). 

29 Whittaker and Robinson (1949) p. 158. 
30 Newton (1959-1960) vol. 2, p. 137. 
Interestingly, Stirling quoted just this passage at the end of proposition 18 of his 
book. Clearly, Newton was pleased with the result of his researches on interpolation, 
so he did not neglect the chance to include at least one result in the Principia, as 
Lemma V, Book III. Newton gave his method of interpolation by divided differences 
in the Principia without proof; he provided details in his very short Methodus 
Differentialis.31 The first proposition stated that if one started with a polynomial, then 
the divided differences would be polynomials of degree one less:32 


If the abscissa of a curve consist of a given quantity A and an indeterminate quantity x, and 
if the ordinate consist of any number of quantities b, c, d, e, ... multiplied respectively into a 
corresponding number of terms of the geometric progression $x, x^2, x^3, x^4, \ldots$ and if ordinates be 
erected at as many points of the abscissa; then the first differences of the ordinates are divisible by 
their intervals; and that the differences of the differences so divided are divisible by the intervals 
between alternate ordinates; and the differences of those differences so divided are divisible by 
the intervals between every third ordinate, and so on indefinitely. 


We describe Newton’s method in the Ampère–Cauchy notation: If $f(x)$ is a polynomial, 
then the first divided difference 

\[
f(x_1, x_2) = \frac{f(x_1) - f(x_2)}{x_1 - x_2} \tag{9.11}
\]

is also a polynomial, as is the second divided difference 

\[
f(x_1, x_2, x_3) = \frac{f(x_1, x_2) - f(x_2, x_3)}{x_1 - x_3} \tag{9.12}
\]

and, in general, so is the $n$th divided difference, defined inductively by 

\[
f(x_1, x_2, \ldots, x_n, x_{n+1}) = \frac{f(x_1, x_2, \ldots, x_n) - f(x_2, x_3, \ldots, x_{n+1})}{x_1 - x_{n+1}}. \tag{9.13}
\]


Newton explicitly worked out all the divided differences for a fourth-degree 
polynomial. In the second proposition, he explained how the original polynomial or 
function could be constructed from the divided differences:33 

With the same suppositions, and taking the number of terms b, c, d, e, ... to be finite, I assert that 
the last of the quotients will be equal to the last of the terms b, c, d, e, ..., and that the remaining 
terms b, c, d, e, ... will be yielded by means of the remaining quotients; and that once these are 
determined there will be given the curve of parabolic kind which shall pass through the end-points 
of all the ordinates. 


If this procedure is applied in general, we obtain Newton’s divided difference formula 
(9.8). In the case of fourth differences we have 


31 Newton (1964-1967) vol. 2, pp. 165-173. 
32 ibid. p. 165. 
33 Newton (1967-1981) vol. 7, p. 247. 
\[
\begin{aligned}
f(x, x_1) &= \frac{f(x)}{x - x_1} - \frac{f(x_1)}{x - x_1}, \\
f(x, x_1, x_2) &= \frac{f(x, x_1)}{x - x_2} - \frac{f(x_1, x_2)}{x - x_2}, \\
f(x, x_1, x_2, x_3) &= \frac{f(x, x_1, x_2)}{x - x_3} - \frac{f(x_1, x_2, x_3)}{x - x_3}, \\
f(x, x_1, x_2, x_3, x_4) &= \frac{f(x, x_1, x_2, x_3)}{x - x_4} - \frac{f(x_1, x_2, x_3, x_4)}{x - x_4}.
\end{aligned}
\tag{9.14}
\]

Thus, in each step, the values from the previous equation are substituted for the 
terms on the right-hand side and the resulting equation is multiplied by 
$(x - x_1)(x - x_2)(x - x_3)(x - x_4)$, yielding 

\[
\begin{aligned}
f(x) ={}& f(x_1) + (x - x_1) f(x_1, x_2) + (x - x_1)(x - x_2) f(x_1, x_2, x_3) \\
&+ (x - x_1)(x - x_2)(x - x_3) f(x_1, x_2, x_3, x_4) \\
&+ (x - x_1)(x - x_2)(x - x_3)(x - x_4) f(x, x_1, x_2, x_3, x_4).
\end{aligned}
\tag{9.15}
\]
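As we discuss in Section 9.4, the $n$th divided difference $f(x_0, x_1, \ldots, x_n)$ is 
symmetric in the variables $x_0, x_1, \ldots, x_n$. Note also that if the points 
$x_0 < x_1 < x_2 < \cdots < x_n$ are equidistant, with the distance between each being $h = x_i - x_{i-1}$ 
for $i = 1, 2, \ldots, n$, then we can prove by induction that 

\[
f(x_0, x_1, \ldots, x_n) = \frac{1}{n!\, h^n}\, \Delta^n f(x_0), \tag{9.16}
\]

where 

\[
\Delta^i f(x_0) = \Delta\bigl(\Delta^{i-1} f(x_0)\bigr) = \Delta^{i-1} f(x_0 + h) - \Delta^{i-1} f(x_0).
\]

For $n = 1$ the result holds true: 

\[
f(x_0, x_1) = \frac{f(x_0 + h) - f(x_0)}{h} = \frac{1}{h}\, \Delta f(x_0).
\]

Next suppose that 

\[
f(x_0, x_1, \ldots, x_{n-1}) = \frac{1}{(n-1)!\, h^{n-1}}\, \Delta^{n-1} f(x_0).
\]

Then we have 

\[
\begin{aligned}
f(x_0, x_1, \ldots, x_n) &= \frac{f(x_1, x_2, \ldots, x_n) - f(x_0, x_1, \ldots, x_{n-1})}{nh} \\
&= \frac{1}{n!\, h^n}\bigl(\Delta^{n-1} f(x_0 + h) - \Delta^{n-1} f(x_0)\bigr) \\
&= \frac{1}{n!\, h^n}\, \Delta^n f(x_0),
\end{aligned}
\]

from which the result is immediate. 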
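The relation between divided differences on equidistant points and ordinary forward differences, namely $f(x_0, \ldots, x_n) = \Delta^n f(x_0) / (n!\, h^n)$, is easy to test numerically. The following Python sketch uses exact rationals; the function names and the sample polynomial are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import factorial

def divided_difference(xs, ys):
    """Top divided difference f(x_0, ..., x_n) from the inductive definition."""
    col = list(ys)
    for k in range(1, len(xs)):
        col = [(col[i] - col[i + 1]) / (xs[i] - xs[i + k])
               for i in range(len(col) - 1)]
    return col[0]

def forward_difference(ys, n):
    """Delta^n f(x_0) from equally spaced samples ys[0], ys[1], ..."""
    col = list(ys)
    for _ in range(n):
        col = [col[i + 1] - col[i] for i in range(len(col) - 1)]
    return col[0]

h, x0 = Fraction(1, 2), Fraction(3)
f = lambda x: x**5 - 4 * x**2 + 1
for n in range(1, 6):
    xs = [x0 + i * h for i in range(n + 1)]
    ys = [f(x) for x in xs]
    assert divided_difference(xs, ys) == forward_difference(ys, n) / (factorial(n) * h**n)
```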
In the third proposition, Newton derived his central difference formulas for the 
case where the points were equidistant. When the number of interpolating points was 
odd, he presented what is now known as the Newton–Stirling formula and, for the 
even case, the so-called Newton–Bessel formula. He did not write down details of the 
derivation, but it is most likely that he obtained them from his general divided difference 
formula, the approach employed in modern textbooks. 

The Newton–Stirling formula34 was actually discovered by Newton, but it came 
to be widely known through Stirling, who stated it and worked out examples in his 
Methodus Differentialis. To state the formula: suppose that $f(t)$ is given for the values 

\[
\ldots, x_0 - 2h,\; x_0 - h,\; x_0,\; x_0 + h,\; x_0 + 2h, \ldots;
\]

then 

\[
\begin{aligned}
f(x_0 + xh) ={}& f(x_0) + x\,\frac{\Delta f(x_0) + \Delta f(x_0 - h)}{2} + \frac{x^2}{2!}\,\Delta^2 f(x_0 - h) \\
&+ \frac{x(x^2 - 1^2)}{3!}\,\frac{\Delta^3 f(x_0 - h) + \Delta^3 f(x_0 - 2h)}{2} + \frac{x^2(x^2 - 1^2)}{4!}\,\Delta^4 f(x_0 - 2h) \\
&+ \frac{x(x^2 - 1^2)(x^2 - 2^2)}{5!}\,\frac{\Delta^5 f(x_0 - 2h) + \Delta^5 f(x_0 - 3h)}{2} \\
&+ \frac{x^2(x^2 - 1^2)(x^2 - 2^2)}{6!}\,\Delta^6 f(x_0 - 3h) + \cdots.
\end{aligned}
\tag{9.17}
\]

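To prove (9.17), we apply (9.8) with $h > 0$ and 

\[
x = y_0 + yh, \quad x_1 = y_0, \quad x_2 = y_0 + h, \quad x_3 = y_0 - h, \quad
x_4 = y_0 + 2h, \quad x_5 = y_0 - 2h, \ldots.
\]

Then 

\[
x - x_1 = hy, \quad x - x_2 = h(y - 1), \quad x - x_3 = h(y + 1), \quad x - x_4 = h(y - 2), \ldots,
\]

and applying these values in (9.8) produces 

\[
\begin{aligned}
f(y_0 + yh) ={}& f(y_0) + h y\, f(y_0, y_0 + h) + h^2 y(y - 1) f(y_0, y_0 + h, y_0 - h) \\
&+ h^3 y(y - 1)(y + 1) f(y_0, y_0 + h, y_0 - h, y_0 + 2h) \\
&+ h^4 y(y - 1)(y + 1)(y - 2) f(y_0, y_0 + h, y_0 - h, y_0 + 2h, y_0 - 2h) + \cdots.
\end{aligned}
\tag{9.18}
\]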


34 Newton (1964-1967) vol. 2, pp. 167-168. Stirling and Tweddle (2003) pp. 119-120. Whittaker and 
Robinson (1959) p. 38. 
Now from the symmetry of the divided differences and (9.16), we obtain 

\[
\begin{aligned}
f(y_0, y_0 + h) &= \frac{1}{h}\,\Delta f(y_0); \\
f(y_0, y_0 + h, y_0 - h) &= f(y_0 - h, y_0, y_0 + h) = \frac{1}{2!\, h^2}\,\Delta^2 f(y_0 - h); \\
f(y_0, y_0 + h, y_0 - h, y_0 + 2h) &= f(y_0 - h, y_0, y_0 + h, y_0 + 2h) = \frac{1}{3!\, h^3}\,\Delta^3 f(y_0 - h); \\
f(y_0, y_0 + h, y_0 - h, y_0 + 2h, y_0 - 2h) &= f(y_0 - 2h, y_0 - h, y_0, y_0 + h, y_0 + 2h) = \frac{1}{4!\, h^4}\,\Delta^4 f(y_0 - 2h).
\end{aligned}
\tag{9.19}
\]


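When the values in (9.19) are substituted in (9.18), the result is the Newton–Gauss 
interpolation formula: 

\[
\begin{aligned}
f(y_0 + yh) ={}& f(y_0) + y\,\Delta f(y_0) + \frac{y(y - 1)}{2!}\,\Delta^2 f(y_0 - h) \\
&+ \frac{(y + 1)y(y - 1)}{3!}\,\Delta^3 f(y_0 - h) + \frac{(y + 1)y(y - 1)(y - 2)}{4!}\,\Delta^4 f(y_0 - 2h) + \cdots.
\end{aligned}
\tag{9.20}
\]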


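We can rearrange the terms of (9.20) as follows: 

\[
\begin{aligned}
f(y_0 + yh) ={}& f(y_0) + y\Bigl(\Delta f(y_0) - \frac{1}{2}\,\Delta^2 f(y_0 - h)\Bigr) + \frac{y^2}{2!}\,\Delta^2 f(y_0 - h) \\
&+ \frac{y(y^2 - 1^2)}{3!}\Bigl(\Delta^3 f(y_0 - h) - \frac{1}{2}\,\Delta^4 f(y_0 - 2h)\Bigr) \\
&+ \frac{y^2(y^2 - 1^2)}{4!}\,\Delta^4 f(y_0 - 2h) + \cdots.
\end{aligned}
\tag{9.21}
\]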


Now observe that if the values 

\[
\Delta^2 f(y_0 - h) = \Delta f(y_0) - \Delta f(y_0 - h), \qquad
\Delta^4 f(y_0 - 2h) = \Delta^3 f(y_0 - h) - \Delta^3 f(y_0 - 2h), \ldots
\]

are substituted in (9.21), the result is the Newton–Stirling formula (9.17). 
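As a sanity check, the Newton–Stirling formula truncated after the fourth-difference term uses the five ordinates at $x_0 - 2h, \ldots, x_0 + 2h$ and should therefore reproduce any quartic exactly. The Python sketch below tests this with rational arithmetic; the helper names `delta` and `stirling` and the sample quartic are our own choices.

```python
from fractions import Fraction

def delta(f, k, x0, h, order):
    """Forward difference Delta^order f(x0 + k h)."""
    if order == 0:
        return f(x0 + k * h)
    return delta(f, k + 1, x0, h, order - 1) - delta(f, k, x0, h, order - 1)

def stirling(f, x0, h, x):
    """Newton-Stirling formula through fourth differences, evaluated at x0 + x h."""
    d = lambda order, k: delta(f, k, x0, h, order)
    return (f(x0)
            + x * (d(1, 0) + d(1, -1)) / 2
            + x**2 / 2 * d(2, -1)
            + x * (x**2 - 1) / 6 * (d(3, -1) + d(3, -2)) / 2
            + x**2 * (x**2 - 1) / 24 * d(4, -2))

f = lambda t: t**4 - 3 * t**3 + t - 7
x0, h, x = Fraction(2), Fraction(1, 3), Fraction(2, 5)
assert stirling(f, x0, h, x) == f(x0 + x * h)
```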
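The Newton–Bessel formula,35 due to Newton but extensively applied and worked 
with by Bessel, may be given as 

\[
\begin{aligned}
f\bigl(x_0 + \tfrac{1}{2}h + xh\bigr) ={}& \frac{1}{2}\bigl(f(x_0) + f(x_0 + h)\bigr) + x\,\Delta f(x_0) \\
&+ \frac{x^2 - \frac{1}{4}}{2!}\cdot\frac{1}{2}\bigl(\Delta^2 f(x_0 - h) + \Delta^2 f(x_0)\bigr)
 + \frac{x\bigl(x^2 - \frac{1}{4}\bigr)}{3!}\,\Delta^3 f(x_0 - h) \\
&+ \frac{\bigl(x^2 - \frac{1}{4}\bigr)\bigl(x^2 - \frac{9}{4}\bigr)}{4!}\cdot\frac{1}{2}\bigl(\Delta^4 f(x_0 - 2h) + \Delta^4 f(x_0 - h)\bigr) + \cdots.
\end{aligned}
\tag{9.22}
\]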


Stirling gave applications of both formulas (9.17) and (9.22) in proposition 20 
of his Methodus. In the second example in proposition 21, he applied the 
Newton–Bessel formula to show that $\Gamma\bigl(\tfrac{3}{2}\bigr) = \frac{\sqrt{\pi}}{2}$. In fact, he obtained the approximation 
0.8862269251 and then noted that this was $\frac{\sqrt{\pi}}{2}$. We remark that Stirling made this 
observation before Euler’s paper on the gamma function had appeared.36 These 
matters are further discussed in Section 17.2. 

The reader might wish to read Chandrasekhar,37 who shows in detail that the 
Newton–Stirling and Newton–Bessel interpolation formulas given in this chapter are 
identical with the original statements given by Newton in his Methodus Differentialis 
of 1711. 


9.3 Gregory—Newton Interpolation Formula 


The Gregory—Newton formula (9.7), or the Harriot—Briggs formula, is important not 
only in numerical analysis but also in the study of sequences whose nth differences, 
for some n, are constant. Apart from interpolation theory, these sequences are now 
studied as a part of combinatorial analysis, but in the seventeenth and early eighteenth 
centuries they arose in elementary number theory and in probability theory. It is 
therefore interesting to consider the methods by which mathematicians of that period 
proved this formula. Unfortunately, Gregory did not leave us a proof. It is possible 
that he had the simple inductive argument given by Stirling in proposition 19 of 
his Methodus Differentialis.38 Stirling assumed that there existed some unknown 
coefficients, A, B,C, D, ... such that 


35 Newton (1964-1967) vol. 2, p. 168. Stirling and Tweddle (2003) pp. 120-121. Whittaker and Robinson 
(1959) pp. 39-40. 

36 Stirling and Tweddle (2003) p. 127. 

37 Chandrasekhar (1995) pp. 495-498. 

38 Stirling and Tweddle (2003) pp. 112-114. 
\[
f(z) = A + Bz + C\,\frac{z(z - 1)}{1 \cdot 2} + D\,\frac{z(z - 1)(z - 2)}{1 \cdot 2 \cdot 3} + \cdots.
\]

Clearly $A = f(0)$. Moreover, 

\[
\begin{aligned}
\Delta f(z) &= f(z + 1) - f(z) \\
&= B\,\Delta z + C\,\Delta\frac{z(z - 1)}{1 \cdot 2} + D\,\Delta\frac{z(z - 1)(z - 2)}{1 \cdot 2 \cdot 3} + \cdots.
\end{aligned}
\]

Observing that for $n = 2, 3, 4, \ldots$ 

\[
\Delta\,\frac{z(z - 1) \cdots (z - n + 1)}{n!} = \frac{z(z - 1) \cdots (z - n + 2)}{(n - 1)!}, \tag{9.23}
\]

and that $\Delta z = (z + 1) - z = 1$, he obtained $B = \Delta f(0)$. Continuing this process, 
he got $C = \Delta^2 f(0)$, $D = \Delta^3 f(0), \ldots$, completing the proof. Note that Gregory’s 
version of the formula, given by (9.7), would be obtained by taking $A = 0$ and $z = \frac{a}{c}$. 
We can also write the Gregory–Newton formula as 

\[
f(z) = f(0) + \sum_{n=1}^{\infty} \binom{z}{n} \Delta^n f(0), \tag{9.24}
\]

where $\binom{z}{n} = \frac{z(z - 1) \cdots (z - n + 1)}{n!}$. 


However, the seventeenth- and eighteenth-century mathematicians, including Newton 
and Stirling, did not write down the general expression for $\Delta^n f(0)$. They did not 
have a notation for the general expression, but they gave the values of the expression 
for small $n$: 

\[
\begin{aligned}
\Delta f(0) &= f(1) - f(0); \qquad \Delta^2 f(0) = f(2) - 2f(1) + f(0); \\
\Delta^3 f(0) &= f(3) - 3f(2) + 3f(1) - f(0), \ldots.
\end{aligned}
\tag{9.25}
\]

Observe that the absolute values of the coefficients on the right-hand sides of the 
expressions in (9.25) are 1, 1; 1, 2, 1; 1, 3, 3, 1; .... We recognize these as the binomial 
coefficients; we can thus write the general formula as 

\[
\Delta^n f(0) = (-1)^n \sum_{k=0}^{n} (-1)^k \binom{n}{k} f(k). \tag{9.26}
\]
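The Gregory–Newton expansion $f(z) = \sum_{n \ge 0} \binom{z}{n} \Delta^n f(0)$, with $\Delta^n f(0)$ given by the alternating binomial sum, can be checked on a polynomial, since the differences beyond its degree vanish. The Python sketch below is illustrative only; the function names and the sample cubic are assumptions.

```python
from fractions import Fraction
from math import comb

def forward_difference_at_zero(f, n):
    """Delta^n f(0) via (-1)^n * sum_k (-1)^k C(n, k) f(k)."""
    return (-1) ** n * sum((-1) ** k * comb(n, k) * f(k) for k in range(n + 1))

def generalized_binomial(z, n):
    """z (z - 1) ... (z - n + 1) / n! for a rational z."""
    out = Fraction(1)
    for i in range(n):
        out *= (Fraction(z) - i) / (i + 1)
    return out

def gregory_newton(f, z, terms):
    """Partial sum of the Gregory-Newton series with the given number of terms."""
    return sum(generalized_binomial(z, n) * forward_difference_at_zero(f, n)
               for n in range(terms))

f = lambda t: t**3 + 2 * t            # cubic: differences of order > 3 vanish
z = Fraction(5, 2)
assert forward_difference_at_zero(f, 3) == 6
assert gregory_newton(f, z, 4) == f(z)   # 165/8
```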


9.4 Waring, Lagrange: Interpolation Formula 


Edward Waring and Joseph Lagrange independently but nearly simultaneously took up 
the interpolation problem of finding the polynomial of degree $n - 1$ taking prescribed 
values $y_1, y_2, \ldots, y_n$ at $n$ given points $x_1, x_2, \ldots, x_n$. Their result is usually 
called Lagrange’s interpolation formula, but it might be more accurate to call it the 
Waring–Lagrange interpolation formula. Of course, this result may readily be derived 
by writing the Newton divided differences in symmetric form, but Lagrange and 
Waring gave the solution in a convenient and useful form. In fact, Waring remarked 
in his 1779 paper on the topic39 that he could state and prove the result without any 
“recourse to finding the successive differences.” We state Waring’s theorem in modern 
notation: Let $y$ be a polynomial of degree $n - 1$ and let the values of $y$ at $x_1, x_2, \ldots, x_n$ 
be given by $y_1, y_2, \ldots, y_n$. Then 

\[
\begin{aligned}
y ={}& \frac{(x - x_2)(x - x_3) \cdots (x - x_n)}{(x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n)}\, y_1
 + \frac{(x - x_1)(x - x_3) \cdots (x - x_n)}{(x_2 - x_1)(x_2 - x_3) \cdots (x_2 - x_n)}\, y_2 \\
&+ \cdots + \frac{(x - x_1)(x - x_2) \cdots (x - x_{n-1})}{(x_n - x_1)(x_n - x_2) \cdots (x_n - x_{n-1})}\, y_n.
\end{aligned}
\tag{9.27}
\]
Waring’s proof consisted in the observation that, when $x = x_1$, the first term 
on the right was $y_1$, while, because of the factor $x - x_1$, all the other terms were 
zero. Continuing this argument, taking successive values of $x$, Waring completed his 
proof. Lagrange published this result in 1795 in the last three pages of his Leçons 
élémentaires sur les mathématiques données à l’École Normale.40 Observe that (9.27) 
can be written in a compact form: Let 


\[
f(x) = (x - x_1)(x - x_2) \cdots (x - x_n) = (x - x_1) g(x). \tag{9.28}
\]

Now taking the derivative of (9.28) gives us 

\[
f'(x) = g(x) + (x - x_1) g'(x).
\]

Thus 

\[
f'(x_1) = g(x_1) = (x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n), \tag{9.29}
\]

and so the denominator of the first term on the right-hand side of (9.27) is $f'(x_1)$. 
Similarly, the denominator of the second term is $f'(x_2)$, and so on. Thus, if $y(x)$ is a 
polynomial of degree $n - 1$ and $y(x_j) = y_j$, then (9.27) can be written as 

\[
y(x) = \sum_{j=1}^{n} \frac{f(x)}{f'(x_j)(x - x_j)}\, y_j. \tag{9.30}
\]
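Waring's formula is simple to evaluate directly: each prescribed value $y_j$ is multiplied by the product of the factors $(x - x_i)/(x_j - x_i)$ over $i \ne j$. The following Python sketch, with a hypothetical function name and sample data, checks that four nodes reproduce a cubic exactly.

```python
from fractions import Fraction

def waring_lagrange(xs, ys, x):
    """Evaluate sum_j y_j * prod_{i != j} (x - x_i) / (x_j - x_i)."""
    x = Fraction(x)
    total = Fraction(0)
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = Fraction(yj)
        for i, xi in enumerate(xs):
            if i != j:
                term *= (x - xi) / Fraction(xj - xi)
        total += term
    return total

xs = [-1, 0, 2, 5]
ys = [x**3 - x + 1 for x in xs]
# At a node the formula returns the prescribed value; elsewhere it returns the cubic.
assert waring_lagrange(xs, ys, 2) == ys[2]
assert waring_lagrange(xs, ys, 3) == 3**3 - 3 + 1
```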


In general, taking (9.27) as an interpolation formula, its right-hand side is an 
approximation for its left-hand side, provided $y(x)$ is not a polynomial of degree 
$\le n - 1$. We note that (9.30) is useful even as an identity when $y(x)$ is a polynomial 
of degree $\le n - 1$. We mention a corollary of (9.30) used by I. J. Good to find the 


39 Waring (1779). 
40 Lagrange (1867-1892) vol. 7, pp. 183-287, especially pp. 285-287. 




constant term in a product conjectured by Dyson. Good took $y(x) = 1$ in (9.30) and 
then set $x = 0$ to find41 

\[
1 = \sum_{j=1}^{n} \prod_{i \ne j} \Bigl(1 - \frac{x_j}{x_i}\Bigr)^{-1}. \tag{9.31}
\]

We present Good’s application of (9.31) in Section 17.13, and an application of (9.30) 
in Section 14.6. 
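Good's identity states that for distinct nonzero $x_1, \ldots, x_n$ the sum over $j$ of $\prod_{i \ne j} (1 - x_j/x_i)^{-1}$ equals 1. A quick exact check in Python follows; the function name `good_sum` and the sample values are arbitrary choices made for illustration.

```python
from fractions import Fraction

def good_sum(xs):
    """Sum over j of the product over i != j of 1 / (1 - x_j / x_i)."""
    total = Fraction(0)
    for j, xj in enumerate(xs):
        term = Fraction(1)
        for i, xi in enumerate(xs):
            if i != j:
                term /= 1 - Fraction(xj, xi)
        total += term
    return total

assert good_sum([2, 3]) == 1          # 3 + (-2)
assert good_sum([2, 3, 5, 7]) == 1
```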

To see how the Waring–Lagrange formula follows when Newton’s $n$th divided 
difference (9.13) is written in symmetric form, first observe that the second difference 
can be expanded as 

\[
\begin{aligned}
y(x, x_1, x_2) &= \frac{y(x, x_1)}{x - x_2} - \frac{y(x_1, x_2)}{x - x_2} \\
&= \frac{y(x) - y(x_1)}{(x - x_1)(x - x_2)} - \frac{y(x_1) - y(x_2)}{(x_1 - x_2)(x - x_2)} \\
&= \frac{y(x)}{(x - x_1)(x - x_2)} - \frac{y(x_1)}{(x_1 - x_2)(x - x_1)} - \frac{y(x_2)}{(x_2 - x_1)(x - x_2)}.
\end{aligned}
\tag{9.32}
\]

We can then employ (9.32) to derive 

\[
\begin{aligned}
y(x, x_1, x_2, x_3) ={}& \frac{y(x)}{(x - x_1)(x - x_2)(x - x_3)} - \frac{y(x_1)}{(x - x_1)(x_1 - x_2)(x_1 - x_3)} \\
&- \frac{y(x_2)}{(x - x_2)(x_2 - x_1)(x_2 - x_3)} - \frac{y(x_3)}{(x - x_3)(x_3 - x_2)(x_3 - x_1)},
\end{aligned}
\tag{9.33}
\]

from which we see that $y(x, x_1, x_2, x_3)$ is symmetric in the four variables. We can, in 
fact, prove inductively that $y(x_0, x_1, \ldots, x_n)$ is symmetric in the $n + 1$ variables. Next, 
using an inductive argument, we can show that, with $f(x)$ given by (9.28), 

\[
y(x, x_1, x_2, \ldots, x_n) = \frac{y(x)}{f(x)} - \sum_{j=1}^{n} \frac{y(x_j)}{(x - x_j) f'(x_j)}, \tag{9.34}
\]

an equation that implies the Waring–Lagrange formula (9.30), because for $y(x)$, a 
polynomial of degree at most $n - 1$, the $n$th difference $y(x, x_1, x_2, \ldots, x_n)$ is zero. 
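The symmetry of the divided differences can be tested by comparing the recursive definition with the symmetric sum in which each value $y(t_j)$ is divided by $\prod_{i \ne j} (t_j - t_i)$. The Python sketch below checks every ordering of four points; the function names and the sample function $1/(t + 1)$ are assumptions.

```python
from fractions import Fraction
from itertools import permutations

def divided_difference(points):
    """Recursive divided difference of a list of (t, y) pairs."""
    if len(points) == 1:
        return points[0][1]
    return (divided_difference(points[:-1]) - divided_difference(points[1:])) / (
        points[0][0] - points[-1][0])

def symmetric_sum(points):
    """Sum over j of y_j / prod_{i != j} (t_j - t_i)."""
    total = Fraction(0)
    for j, (tj, yj) in enumerate(points):
        denom = Fraction(1)
        for i, (ti, _) in enumerate(points):
            if i != j:
                denom *= tj - ti
        total += yj / denom
    return total

y = lambda t: Fraction(1) / (t + 1)
points = [(Fraction(t), y(Fraction(t))) for t in (0, 1, 3, 4)]
expected = symmetric_sum(points)
# Every ordering of the four points gives the same divided difference.
assert all(divided_difference(list(p)) == expected for p in permutations(points))
```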


9.5 Euler on Interpolation 


In a paper that he presented to the Petersburg Academy in 1772,42 Euler obtained 
an interpolation formula similar to Newton’s. Euler entitled his paper, “De eximio 
usu methodi interpolationum in serierum doctrina,” or “The extraordinary use of the 


41 Good (1970). 
42 Eu. I-15 pp. 435-497. E555. 
method of interpolation in the doctrine of series.” Euler considered an odd function 
$f(x)$ that at $x = a_1, a_2, a_3, a_4, \ldots$ takes the values $p_1, p_2, p_3, p_4, \ldots$ respectively. In 
section 2 of his paper, he offered the formula 

\[
\begin{aligned}
f(x) ={}& A_1 x + A_2 x(x^2 - a_1^2) + A_3 x(x^2 - a_1^2)(x^2 - a_2^2) \\
&+ A_4 x(x^2 - a_1^2)(x^2 - a_2^2)(x^2 - a_3^2) + \cdots,
\end{aligned}
\tag{9.35}
\]

where 

\[
\begin{aligned}
A_1 &= \frac{p_1}{a_1}, \\
A_2 &= \frac{p_1}{a_1(a_1^2 - a_2^2)} + \frac{p_2}{a_2(a_2^2 - a_1^2)}, \\
A_3 &= \frac{p_1}{a_1(a_1^2 - a_2^2)(a_1^2 - a_3^2)} + \frac{p_2}{a_2(a_2^2 - a_1^2)(a_2^2 - a_3^2)} + \frac{p_3}{a_3(a_3^2 - a_1^2)(a_3^2 - a_2^2)}, \ldots.
\end{aligned}
\]


At the end of section 6 of this paper, Euler added the remark that (9.35) was not the 
most general solution of the problem. Setting 

\[
Q = x \prod_{n} \frac{x^2 - a_n^2}{a_n^2},
\]

he noted that a function of $Q$ when added to (9.35) would give the general solution. 


9.6 Cauchy, Jacobi: Waring—Lagrange Interpolation Formula 


The Waring–Lagrange interpolation formula is easy to prove, as Waring’s demonstration 
shows. It is nevertheless interesting to consider other proofs such as those 
of Cauchy and Jacobi. Cauchy’s argument, presented in his 1821 Analyse algébrique 
in the chapter on symmetric and alternating functions, was based on an interesting 
evaluation of the so-called Vandermonde determinant, without using modern notation 
for determinants. Lagrange had used this evaluation in a different context almost fifty 
years earlier. Cauchy was an expert on determinants, a term he borrowed from Gauss. 
He wrote an important 1812 paper on this topic, in which he also proved results on 
permutation groups and alternating functions. In his book, Cauchy considered the 
system of linear equations 

\[
\alpha^j x + \alpha_1^j x_1 + \cdots + \alpha_{n-1}^j x_{n-1} = k_j, \tag{9.36}
\]

where $j = 0, 1, \ldots, n - 1$. We have used subscripts more freely than Cauchy; he set 

\[
f(\alpha) = (\alpha - \alpha_1)(\alpha - \alpha_2) \cdots (\alpha - \alpha_{n-1}) = \alpha^{n-1} + A_{n-2}\alpha^{n-2} + \cdots + A_1 \alpha + A_0,
\]


43 Cauchy (1989) Note V. 
so that 

\[
\alpha_i^{n-1} + A_{n-2}\alpha_i^{n-2} + \cdots + A_1 \alpha_i + A_0 = 0, \quad \text{for } i = 1, 2, \ldots, n - 1.
\]

Cauchy multiplied the first equation of the system (9.36) by $A_0$; the second, when 
$j = 1$, by $A_1$; ...; and the last, when $j = n - 1$, by 1 and then added to get 

\[
(A_0 + A_1 \alpha + \cdots + \alpha^{n-1})\, x = k_0 A_0 + k_1 A_1 + \cdots + k_{n-2} A_{n-2} + k_{n-1}
\]

or 

\[
x = \frac{k_{n-1} + A_{n-2} k_{n-2} + \cdots + A_0 k_0}{f(\alpha)}. \tag{9.37}
\]

He derived the values of $x_1, x_2, \ldots, x_{n-1}$ in a similar way. Cauchy applied this 
result to obtain the Lagrange interpolation polynomial. He supposed $u_0, u_1, \ldots, u_{n-1}$ 
to be values of some function at the numbers $x_0, x_1, \ldots, x_{n-1}$. It was required to find 
a polynomial of degree $n - 1$ 

\[
u = a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1},
\]

such that its values were $u_0, u_1, \ldots, u_{n-1}$ at $x_0, x_1, \ldots, x_{n-1}$, respectively. Then 

\[
u_j = a_0 + a_1 x_j + a_2 x_j^2 + \cdots + a_{n-1} x_j^{n-1}, \tag{9.38}
\]

where $j = 0, 1, \ldots, n - 1$. Note that the coefficient matrix of the system (9.38) is the 
transpose of the coefficient matrix of the system (9.36). So Cauchy multiplied these $n$ 
equations by unknowns $X_0, X_1, \ldots, X_{n-1}$ and subtracted their sum from the equation 
for $u$ to get 

\[
\begin{aligned}
u - X_0 u_0 - X_1 u_1 - X_2 u_2 - \cdots - X_{n-1} u_{n-1}
={}& (1 - X_0 - X_1 - \cdots - X_{n-1})\, a_0 \\
&+ (x - x_0 X_0 - x_1 X_1 - \cdots - x_{n-1} X_{n-1})\, a_1 \\
&+ (x^2 - x_0^2 X_0 - x_1^2 X_1 - \cdots - x_{n-1}^2 X_{n-1})\, a_2 + \cdots \\
&+ (x^{n-1} - x_0^{n-1} X_0 - x_1^{n-1} X_1 - \cdots - x_{n-1}^{n-1} X_{n-1})\, a_{n-1}.
\end{aligned}
\tag{9.39}
\]

To determine $X_0, X_1, \ldots, X_{n-1}$ so that 

\[
u = X_0 u_0 + X_1 u_1 + \cdots + X_{n-1} u_{n-1},
\]

he set equal to zero all the coefficients of $a_0, a_1, \ldots, a_{n-1}$ on the right-hand side of 
(9.39). Thus, he had the system of equations 

\[
x_0^j X_0 + x_1^j X_1 + \cdots + x_{n-1}^j X_{n-1} = x^j,
\]

with $j = 0, 1, \ldots, n - 1$. He could solve for $X_0, X_1, \ldots$, using the procedure for 
solving (9.36), to find 
\[
\begin{aligned}
X_0 &= \frac{(x - x_1)(x - x_2) \cdots (x - x_{n-1})}{(x_0 - x_1)(x_0 - x_2) \cdots (x_0 - x_{n-1})}, \\
X_1 &= \frac{(x - x_0)(x - x_2) \cdots (x - x_{n-1})}{(x_1 - x_0)(x_1 - x_2) \cdots (x_1 - x_{n-1})},
\end{aligned}
\tag{9.40}
\]

and so on, yielding him the Waring–Lagrange interpolation formula. 
Jacobi’s method employed partial fractions; he presented it in his doctoral dissertation 
on this topic as well as in his 1826 paper on Gauss quadrature.44 He let 

\[
g(x) = (x - x_0)(x - x_1) \cdots (x - x_{n-1})
\]

and $u(x)$ be the polynomial of degree $n - 1$ whose values were 
$u_0, \ldots, u_{n-1}$ at $x_0, \ldots, x_{n-1}$, 
respectively. Then by a partial fractions expansion he got 

\[
\frac{u(x)}{g(x)} = \frac{B_0}{x - x_0} + \frac{B_1}{x - x_1} + \cdots + \frac{B_{n-1}}{x - x_{n-1}}; \tag{9.41}
\]

multiplying by $x - x_j$ and then setting $x = x_j$, he obtained 

\[
B_j = \frac{u_j}{(x_j - x_0) \cdots (x_j - x_{j-1})(x_j - x_{j+1}) \cdots (x_j - x_{n-1})}. \tag{9.42}
\]

Jacobi arrived at Lagrange’s formula by multiplying across by $g(x)$. We note that 
Jacobi’s dissertation also discussed the case in which some of the $x_j$ were repeated. 


9.7 Newton on Approximate Quadrature 


The Methodus Differentialis stated the three-eighths rule for finding the approximate 
area under a curve when four values of the function were known; one proposition 
suggests that Newton most probably derived the formula by integrating the interpo- 
lating cubic for the four points. However, in October 1695, he wrote a very short 
manuscript,45 though he left it incomplete, presenting his derivation of some rules 
for approximate quadrature. Surprisingly, he did not obtain these rules by integrating 
the interpolating polynomials but by means of heuristic and somewhat geometric 
reasoning. Since interpolation calculations tend to become very unwieldy, perhaps 
Newton sought a short cut, though it is not clear what stimulated him to write 
this short note. Whiteside conjectured that Newton may have been working on his 
contemporaneous amplified lunar theory where he used some of the results. 

Newton wrote his results consecutively for two, three, four, ... ordinates. In 
Figure 9.1, he took equally spaced points A, B,C, D,... on the abscissa (x-axis) and 
points K,L,M,N,... onthe curve (y = f(x)) such that AK, BL,CM,DN,... were 


44 Jacobi (1826). 
45 Newton (1967-1981) vol. 7, pp. 690-699. 
Figure 9.1 Newton’s approximate quadrature. 


the ordinates, or y values of the corresponding points on the curve. For two points, he 
gave the trapezoidal rule labeled as Case 1. 


If there be given two ordinates AK and BL, 
make the area $(AKLB) = \frac{1}{2}(AK + BL)AB$. 


He next obtained Simpson’s rule, published by Thomas Simpson in his Mathematical 
Dissertations of 1743;46 Simpson gave an interesting geometric proof. We 
note that since Simpson’s books were quite popular, his name got attached to the 
rule. In 1639, Cavalieri gave particular cases of this rule to determine the volume of 
a symmetrical wine cask. In 1668, in his Exercitationes Geometricae, Gregory too 
presented this rule to approximate $\int_0^{\theta} \tan x\,dx$.47 Newton derived Simpson’s and the 
three-eighths rule as Cases 2 and 3, where the box notation denotes area:48 


Case 2. If there be given three AK, BL and CM, say that 
$\frac{1}{2}(AK + CM)AC = \square(AM)$, and again, by Case 1, 

\[
\frac{1}{2}\Bigl(\frac{1}{2}(AK + BL) + \frac{1}{2}(BL + CM)\Bigr)AC = \frac{1}{4}(AK + 2BL + CM)AC = \square(AM),
\]

and that the error in the former solution is to the error in the latter as $AC^2$ to $AB^2$ or 4 to 1, and 
hence the difference $\frac{1}{4}(AK - 2BL + CM)AC$ of the solutions is to the error in the latter as 3 
to 1, and the error in the latter will be 

\[
\frac{1}{12}(AK - 2BL + CM)AC.
\]

Take away this error and the latter solution will come to be 

\[
\frac{1}{6}(AK + 4BL + CM)AC = \square(AM), \quad \text{the solution required.}
\]


46 Simpson (1743) pp. 109-110. 

47 Gregory (1668) pp. 25-27. For more information, see Newton (1967-1981) vol. 7, p. 692, footnote 8 and 
Whiteside (1961b) pp. 248-249. 

48 Newton (1967-1981) vol. 7, pp. 691-693. 
Case 3. If there be given four ordinates AK, BL, CM and DN, say that $\frac{1}{2}(AK + DN)AD = \square(AN)$; 
likewise, that 

\[
\frac{1}{3}\Bigl(\frac{1}{2}(AK + BL) + \frac{1}{2}(BL + CM) + \frac{1}{2}(CM + DN)\Bigr)AD,
\]

that is, $\frac{1}{6}(AK + 2BL + 2CM + DN)AD = \square(AN)$. The errors in the solutions will be as $AD^2$ 
to $AB^2$ or 9 to 1, and hence the difference in the errors (which is the difference 
$\frac{1}{6}(2AK - 2BL - 2CM + 2DN)AD$ in the solutions) will be to the error in the latter as 8 to 1. Take away this error 
and the latter will remain as 

\[
\frac{1}{8}(AK + 3BL + 3CM + DN)AD = \square(AN).
\]


We observe that in these three cases and others, Newton assumed without justification 
that when $n + 1$ equidistant ordinates were given, the corresponding ratio of the 
errors in using the trapezoidal rule would be $n^2 : 1$. Newton went on to consider cases 
with five, seven, and nine ordinates, but his results in the last two cases were not the 
same as the ones obtained by integrating the interpolating polynomials. 

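To describe Newton’s proof of Simpson’s formula in somewhat more analytic 
terms, let $[a, b]$ in Case 2 be the interval with $b = a + 2h$, and let $y = f(x)$ be 
the function on that interval. By the trapezoidal rule, 

\[
\int_a^b f(x)\,dx \approx \frac{1}{2}\bigl(f(a) + f(b)\bigr)(2h) = I_1.
\]

If this rule is applied to each of the intervals $[a, a + h]$, $[a + h, b]$, then 

\[
\begin{aligned}
\int_a^b f(x)\,dx &\approx \frac{1}{2}\Bigl(\frac{1}{2}\bigl(f(a) + f(a + h)\bigr) + \frac{1}{2}\bigl(f(a + h) + f(b)\bigr)\Bigr)(2h) \\
&= \frac{1}{4}\bigl(f(a) + 2f(a + h) + f(b)\bigr)(2h) = I_2.
\end{aligned}
\]

Let the errors in the two formulas be $e_1$ and $e_2$, so that 

\[
\int_a^b f(x)\,dx = I_1 + e_1 = I_2 + e_2.
\]

Newton assumed without proof that $\frac{e_1}{e_2} = 4$. Hence, 

\[
\frac{I_2 - I_1}{e_2} = \frac{e_1 - e_2}{e_2} = 3,
\]

so that 

\[
e_2 = \frac{1}{3}(I_2 - I_1) = -\frac{1}{12}\bigl(f(a) - 2f(a + h) + f(b)\bigr)(2h).
\]

When this value of $e_2$ is added to $I_2$, we get Simpson’s approximation 

\[
\frac{1}{6}\bigl(f(a) + 4f(a + h) + f(b)\bigr)(2h). \tag{9.43}
\]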
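Newton's argument amounts to correcting the finer trapezoidal estimate by a fraction of its difference from the coarser one. The Python sketch below encodes this under his unproved assumption that the errors stand in the ratio $n^2 : 1$ for $n + 1$ ordinates; the function names are hypothetical. With three ordinates the result agrees with Simpson's rule, and with four with the three-eighths rule.

```python
from fractions import Fraction

def trapezoid(ys, width):
    """Composite trapezoidal rule over equally spaced ordinates spanning `width`."""
    panels = len(ys) - 1
    return Fraction(width) / panels * (sum(ys) - Fraction(ys[0] + ys[-1]) / 2)

def newton_corrected(ys, width):
    """Correct the fine trapezoidal estimate, assuming e_1 : e_2 = n^2 : 1."""
    n = len(ys) - 1
    coarse = trapezoid([ys[0], ys[-1]], width)     # I_1, end ordinates only
    fine = trapezoid(ys, width)                    # I_2, all n + 1 ordinates
    e2 = (fine - coarse) / (n * n - 1)             # from (I_2 - I_1) / e_2 = n^2 - 1
    return fine + e2

a, b, c, d = Fraction(2), Fraction(7), Fraction(-1), Fraction(5)
assert newton_corrected([a, b, c], 2) == (a + 4 * b + c) * 2 / 6          # Simpson
assert newton_corrected([a, b, c, d], 3) == (a + 3 * b + 3 * c + d) * 3 / 8  # three-eighths
```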
9.8 Hermite: Approximate Integration


The formulas of Newton, Cotes, and Stirling for numerical integration were used 
without change for a century. In the nineteenth century, mathematicians began to 
present new methods, starting with Gauss, whose work in this area is discussed in our 
treatment of orthogonal polynomials. Charles Hermite (1822-1901) was a professor at 
the École Polytechnique. He gave a series of analysis lectures in 1873; these and other
such lectures were published and serve as a valuable resource even today. For example,
Hermite discussed an original method⁴⁹ for the numerical evaluation of integrals of
the form

\[ \int_{-1}^{+1} \frac{\phi(x)}{\sqrt{1 - x^2}}\,dx, \tag{9.44} \]

where $\phi(x)$ was an analytic function. He started with the $n$th-degree polynomial $F(x)$
defined by 


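\[ F(x) = \cos n(\arccos x). \tag{9.45} \]

By taking the derivative, he obtained

\[ F'(x) = n \sin n(\arccos x)\,\frac{1}{\sqrt{1 - x^2}} = n\,\frac{\sqrt{1 - F^2(x)}}{\sqrt{1 - x^2}}. \]

Hence

\[ \frac{1}{\sqrt{x^2 - 1}} = \frac{F'(x)}{n\sqrt{F^2(x) - 1}} = \frac{F'(x)}{nF(x)}\left(1 - \frac{1}{F^2(x)}\right)^{-1/2} = \frac{F'(x)}{nF(x)}\left(1 + \frac{1}{2}\,\frac{1}{F^2(x)} + \frac{1 \cdot 3}{2 \cdot 4}\,\frac{1}{F^4(x)} + \cdots\right). \]

Hermite observed that the last expression without the first term could be written in
decreasing powers of $x$ in the form

\[ \frac{\lambda_0}{x^{2n+1}} + \frac{\lambda_1}{x^{2n+2}} + \frac{\lambda_2}{x^{2n+3}} + \cdots. \]

Consequently,

\[ \frac{1}{\sqrt{x^2 - 1}} = \frac{F'(x)}{nF(x)} + \frac{\lambda_0}{x^{2n+1}} + \frac{\lambda_1}{x^{2n+2}} + \frac{\lambda_2}{x^{2n+3}} + \cdots. \]

At this point, Hermite invoked the integral formula

\[ \int_{-1}^{+1} \frac{dz}{(x - z)\sqrt{1 - z^2}} = \frac{\pi}{\sqrt{x^2 - 1}} \tag{9.46} \]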


49 Hermite (1873) pp. 452-454. 
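to obtain

\[ \frac{1}{\pi}\int_{-1}^{+1} \frac{dz}{(x - z)\sqrt{1 - z^2}} = \frac{F'(x)}{nF(x)} + \frac{\lambda_0}{x^{2n+1}} + \frac{\lambda_1}{x^{2n+2}} + \cdots = \frac{1}{n}\sum_{j=1}^{n} \frac{1}{x - a_j} + \frac{\lambda_0}{x^{2n+1}} + \frac{\lambda_1}{x^{2n+2}} + \cdots, \]

where $a_1, a_2, \ldots, a_n$ were the $n$ roots of $F(x) = 0$. An application of the geometric
series $\frac{1}{x - z} = \sum_{m=0}^{\infty} \frac{z^m}{x^{m+1}}$ gave him

\[ \frac{1}{\pi x}\int_{-1}^{+1}\left(1 + \frac{z}{x} + \frac{z^2}{x^2} + \cdots\right)\frac{dz}{\sqrt{1 - z^2}} = \frac{1}{nx}\sum_{j=1}^{n}\left(1 + \frac{a_j}{x} + \frac{a_j^2}{x^2} + \cdots\right) + \frac{\lambda_0}{x^{2n+1}} + \cdots. \]

Equating the coefficients of $\frac{1}{x^{l+1}}$ on both sides yielded

\[ \frac{1}{\pi}\int_{-1}^{+1} \frac{z^l}{\sqrt{1 - z^2}}\,dz = \begin{cases} \dfrac{1}{n}\displaystyle\sum_{j=1}^{n} a_j^{\,l} & \text{when } l < 2n, \\[2ex] \dfrac{1}{n}\displaystyle\sum_{j=1}^{n} a_j^{\,2n+s} + \lambda_s & \text{when } l = 2n + s,\ s \ge 0. \end{cases} \]

So, Hermite wrote $\phi(z) = k_0 + k_1 z + k_2 z^2 + \cdots + k_n z^n + \cdots$ in order to obtain
the formula

\[ \frac{1}{\pi}\int_{-1}^{+1} \frac{\phi(z)}{\sqrt{1 - z^2}}\,dz = \frac{1}{n}\sum_{j=1}^{n} \phi(a_j) + R, \tag{9.47} \]

where

\[ R = \lambda_0 k_{2n} + \lambda_1 k_{2n+1} + \cdots. \]

Hermite also noted that since the roots $a_j$ of $F(x) = 0$ were given by

\[ a_j = \cos\left(\frac{2j - 1}{2n}\,\pi\right), \quad j = 1, 2, \ldots, n, \]

he obtained

\[ \frac{1}{\pi}\int_0^{\pi} \phi(\cos\theta)\,d\theta = \frac{1}{n}\sum_{j=1}^{n} \phi\left(\cos\left(\frac{2j - 1}{2n}\,\pi\right)\right) + R. \tag{9.48} \]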
Observe that the expression for the error $R$ shows that it must be zero for any
polynomial $\phi$ of degree less than $2n$. Hermite may have been unaware that in
1849, Brice Bronwin derived formula (9.48) by a different method, but without the
error term.⁵⁰
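The vanishing of the error for polynomials of degree less than $2n$ can be checked numerically. The sketch below is an illustration only; the function names are assumptions, and the closed form used for the moments $\frac{1}{\pi}\int_{-1}^{+1} x^m (1 - x^2)^{-1/2}\,dx$ (a central binomial coefficient divided by $2^m$ for even $m$, zero for odd $m$) is a standard fact that is not taken from the text.

```python
import math

def bronwin_hermite(phi, n):
    """Average of phi over the n zeros of F(x) = cos(n arccos x)."""
    nodes = [math.cos((2 * j - 1) * math.pi / (2 * n)) for j in range(1, n + 1)]
    return sum(phi(a) for a in nodes) / n

def moment(m):
    """(1/pi) * integral of x^m / sqrt(1 - x^2) over [-1, 1], in closed form."""
    return 0.0 if m % 2 else math.comb(m, m // 2) / 2 ** m

n = 4
# Residuals R for the monomials x^0, ..., x^(2n); the first 2n should vanish.
residuals = [moment(m) - bronwin_hermite(lambda x, m=m: x ** m, n)
             for m in range(2 * n + 1)]
print(residuals)
```

For the monomial $x^{2n}$ the residual is the coefficient $\lambda_0$ of the error term; comparing leading coefficients in Hermite's expansion suggests $\lambda_0 = 2^{1-2n}$, which the last entry of `residuals` can be compared against.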


9.9 Chebyshev on Numerical Integration 


A nice feature of the Bronwin-Hermite formula is that it allows us to find an
approximate value of the integral by simply adding the values of the function $\phi(x)$
at the zeros of $F(x)$ and then multiplying by $\pi/n$. Chebyshev’s interest in applications
led him to seek similar formulas for other weight functions. Thus, the purpose of
Chebyshev’s 1874 paper⁵¹ was to find a constant $k$ and numbers $x_1, x_2, \ldots, x_n$ such
that $\int_{-1}^{+1} F(x)\phi(x)\,dx$ could be approximated by $k(\phi(x_1) + \phi(x_2) + \cdots + \phi(x_n))$. Note
that $\phi(x)$ was the function to be integrated with respect to the weight function $F(x)$.
In general, Chebyshev required that the approximation be exact for polynomials of
degree at most $n$, so he looked for a formula of the form


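\[ \int_{-1}^{+1} F(x)\phi(x)\,dx = k\bigl(\phi(x_1) + \phi(x_2) + \cdots + \phi(x_n)\bigr) + k_1\phi^{(n+1)}(0) + k_2\phi^{(n+2)}(0) + \cdots, \tag{9.49} \]

where $\phi^{(m)}$ denoted the $m$th derivative of $\phi$ and $k_1, k_2, \ldots$ were constants. Following
Hermite, he considered the case $\phi(x) = \frac{1}{z - x}$ to obtain

\[ \int_{-1}^{+1} \frac{F(x)}{z - x}\,dx = k\left(\frac{1}{z - x_1} + \frac{1}{z - x_2} + \cdots + \frac{1}{z - x_n}\right) + \frac{(n+1)!\,k_1}{z^{n+2}} + \frac{(n+2)!\,k_2}{z^{n+3}} + \cdots. \tag{9.50} \]

He set $f(z) = (z - x_1)(z - x_2)\cdots(z - x_n)$, so that after multiplying by $z$, the last
relation became

\[ \int_{-1}^{+1} F(x)\,\frac{z}{z - x}\,dx = k\,\frac{z f'(z)}{f(z)} + \frac{1\cdot 2\cdot 3\cdots(n+1)\,k_1}{z^{n+1}} + \frac{1\cdot 2\cdot 3\cdots(n+2)\,k_2}{z^{n+2}} + \cdots. \tag{9.51} \]

He let $z \to \infty$ to get

\[ \int_{-1}^{+1} F(x)\,dx = nk, \quad\text{or}\quad k = \frac{1}{n}\int_{-1}^{+1} F(x)\,dx. \tag{9.52} \]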


50 Bronwin (1849). 
51 Chebyshev (1899-1907) vol. 2, pp. 165-180.
He thus had the value of $k$, and it remained for him to find the polynomial $f(z)$
whose zeros would be the numbers $x_1, x_2, \ldots, x_n$. For this purpose, he integrated
equation (9.50) with respect to $z$ to obtain

\[ \int_{-1}^{+1} F(x)\ln(z - x)\,dx = k\ln\frac{f(z)}{C} - \frac{n!\,k_1}{z^{n+1}} - \frac{(n+1)!\,k_2}{z^{n+2}} - \cdots, \]

where $C$ was a constant. Hence, by exponentiation he could write

\[ f(z)\exp\left(-\frac{n!\,k_1}{k z^{n+1}} - \frac{(n+1)!\,k_2}{k z^{n+2}} - \cdots\right) = C\exp\left(\frac{1}{k}\int_{-1}^{+1} F(x)\ln(z - x)\,dx\right). \]

Chebyshev then noted that the exponential on the left differed from 1 by a series of
powers of $z$ less than $z^{-n}$; hence, he noted that $f(z)$ was the polynomial part of the
exponential on the right-hand side. He deduced Hermite’s formula by taking $F(x) = \frac{1}{\sqrt{1 - x^2}}$
so that

\[ \int_{-1}^{+1} F(x)\ln(z - x)\,dx = \int_{-1}^{+1} \frac{\ln(z - x)}{\sqrt{1 - x^2}}\,dx = \pi\ln\frac{z + \sqrt{z^2 - 1}}{2}. \tag{9.53} \]

Chebyshev could then conclude that the polynomial $f(z)$ in this case was in fact
the polynomial part of

\[ \left(\frac{z + \sqrt{z^2 - 1}}{2}\right)^{n} \]

and he wrote that it was equal to $\frac{1}{2^{n-1}}\cos(n \arccos z)$. He then considered the case
where $F(x) = 1$, to obtain by (9.52): $k = \frac{2}{n}$ and

\[ \int_{-1}^{+1} \ln(z - x)\,dx = \ln\frac{(z + 1)^{z+1}}{(z - 1)^{z-1}} - 2. \]

Thus, Chebyshev arrived at the result he wanted:

\[ \int_{-1}^{+1} \phi(x)\,dx = \frac{2}{n}\bigl(\phi(x_1) + \phi(x_2) + \cdots + \phi(x_n)\bigr), \tag{9.54} \]

where $x_1, x_2, \ldots, x_n$ were the zeros of the polynomial given by the polynomial part of
the expression

\[ e^{-n}(z + 1)^{\frac{n(z+1)}{2}}(z - 1)^{-\frac{n(z-1)}{2}} = z^n \exp\left(-\frac{n}{2\cdot 3\,z^2} - \frac{n}{4\cdot 5\,z^4} - \frac{n}{6\cdot 7\,z^6} - \cdots\right). \]
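He also computed the cases in which $n = 2, 3, 4, 5, 6, 7$ to get the polynomials

\[ z^2 - \tfrac{1}{3}, \qquad z^3 - \tfrac{1}{2}z, \qquad z^4 - \tfrac{2}{3}z^2 + \tfrac{1}{45}, \qquad z^5 - \tfrac{5}{6}z^3 + \tfrac{7}{72}z, \]

\[ z^6 - z^4 + \tfrac{1}{5}z^2 - \tfrac{1}{105}, \qquad z^7 - \tfrac{7}{6}z^5 + \tfrac{119}{360}z^3 - \tfrac{149}{6480}z. \]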


He calculated the zeros of these polynomials to six decimal places. At this juncture, 
Chebyshev pointed out that in (9.54) the sum of the squares of the coefficients had 
the smallest possible value, because they were all equal; thus, his formula might 
sometimes even be an improvement on Gauss’s quadrature formula. 
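The polynomials listed above can be regenerated from the series $z^n \exp\bigl(-\frac{n}{2\cdot 3 z^2} - \frac{n}{4\cdot 5 z^4} - \cdots\bigr)$ by keeping only the non-negative powers of $z$. The following sketch is one way to do this with exact fractions; the function names are illustrative assumptions, and the recurrence used for the exponential of a power series is a standard device rather than Chebyshev's own computation.

```python
from fractions import Fraction

def chebyshev_polynomial(n):
    """Coefficients c[0], c[1], ... of f(z) = c[0] z^n + c[1] z^(n-2) + c[2] z^(n-4) + ...

    f is the polynomial part of z^n * exp(a_1 u + a_2 u^2 + ...), where u = 1/z^2
    and a_j = -n / ((2j)(2j + 1)).
    """
    terms = n // 2
    a = [Fraction(0)] + [Fraction(-n, 2 * j * (2 * j + 1)) for j in range(1, terms + 1)]
    e = [Fraction(1)]
    # If E = exp(A) as power series in u, then E' = A'E, so k e_k = sum_j j a_j e_(k-j).
    for k in range(1, terms + 1):
        e.append(sum(j * a[j] * e[k - j] for j in range(1, k + 1)) / k)
    return e

def evaluate(coeffs, n, z):
    """Value of the polynomial described by `coeffs` at z."""
    return sum(c * z ** (n - 2 * i) for i, c in enumerate(coeffs))

for n in range(2, 8):
    print(n, chebyshev_polynomial(n))
```

Evaluating `evaluate(chebyshev_polynomial(5), 5, 0.832497)` gives a quick check of one of the zeros quoted in the exercises below; a root finder would be needed to reproduce all six decimal places.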


9.10 Exercises 


(1) Let $A, B, C, D, \ldots$ be points on the x-axis and $K, L, M, N, \ldots$ corresponding
points on the curve. Then the ordinates are given by $AK, BL, CM, DN, \ldots$.
Newton described the following formulas for the approximate area under the
curve in the case of seven and nine ordinates:


If there be seven ordinates there will come to be

\[ \tfrac{1}{280}(17AK + 54BL + 51CM + 36DN + 51EO + 54FP + 17GQ)AG = \square AQ. \]

While if there be given nine there will come

\[ \frac{(217AK + 1024BL + 352CM + 1024DN + 436EO + 1024FP + 352GQ + 1024HR + 217IS)AI}{5670} = \square AS. \]


Derive these formulas using Newton’s ideas, as explained in the text and 
as presented by Newton in his Of Quadrature by Ordinates. Recall that these 
formulas, making use of Newton’s assumption on the proportionality of the 
errors, differ from those obtained by integrating the interpolating polynomial. 
See Newton (1967-1981) vol. 7, p. 695, including Whiteside’s footnotes. 


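(2) Suppose

\[ u = \frac{a + bx + cx^2 + \cdots + hx^{n-1}}{\alpha + \beta x + \gamma x^2 + \cdots + \theta x^m} \]

and $u(x_k) = u_k$, $k = 0, 1, 2, \ldots, n + m - 1$. Determine the values of the coefficients
$a, b, c, \ldots, h, \alpha, \beta, \gamma, \ldots, \theta$. In particular, show that when $m = 1$ and $n = 2$,

\[ u = \frac{u_0u_1\,\dfrac{x - x_2}{(x_0 - x_2)(x_1 - x_2)} + u_0u_2\,\dfrac{x - x_1}{(x_0 - x_1)(x_2 - x_1)} + u_1u_2\,\dfrac{x - x_0}{(x_1 - x_0)(x_2 - x_0)}}{u_0\,\dfrac{x_0 - x}{(x_0 - x_1)(x_0 - x_2)} + u_1\,\dfrac{x_1 - x}{(x_1 - x_0)(x_1 - x_2)} + u_2\,\dfrac{x_2 - x}{(x_2 - x_0)(x_2 - x_1)}}. \]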


Cauchy discussed this interpolation by rational functions after he deduced 
the Waring—Lagrange formula in his lectures. See Cauchy (1989) pp. 527-529. 
(3) Chebyshev computed the zeros of the polynomials $z^5 - \tfrac{5}{6}z^3 + \tfrac{7}{72}z$ and
$z^6 - z^4 + \tfrac{1}{5}z^2 - \tfrac{1}{105}$ for use in (9.54). His results were

\[ \pm 0.832497,\ \pm 0.374541,\ 0 \quad\text{and}\quad \pm 0.866247,\ \pm 0.422519,\ \pm 0.266635. \]


Check Chebyshev’s computations. 
(4) Show that Chebyshev’s result in (9.53) implies Hermite’s formula (9.47). 
(5) Prove the following formulas of Stieltjes: 


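\[ \int_{-1}^{+1} \sqrt{1 - x^2}\,f(x)\,dx = \frac{\pi}{n+1}\sum_{k=1}^{n} \sin^2\frac{k\pi}{n+1}\,f\left(\cos\frac{k\pi}{n+1}\right) + \text{corr.}, \]

\[ \int_{-1}^{+1} \sqrt{\frac{1 - x}{1 + x}}\,f(x)\,dx = \frac{4\pi}{2n+1}\sum_{k=1}^{n} \sin^2\frac{k\pi}{2n+1}\,f\left(\cos\frac{2k\pi}{2n+1}\right) + \text{corr.} \]

The correction is zero when $f(x)$ is a polynomial of degree $\le 2n - 1$.

\[ \int_0^1 \frac{f(x)}{\sqrt{x(1 - x)}}\,dx = \frac{\pi}{n}\sum_{k=1}^{n} f\left(\cos^2\frac{(2k - 1)\pi}{4n}\right) + \text{corr.}, \]

\[ \int_0^1 \sqrt{x(1 - x)}\,f(x)\,dx = \frac{\pi}{4n + 4}\sum_{k=1}^{n} \sin^2\frac{k\pi}{n+1}\,f\left(\cos^2\frac{k\pi}{2n+2}\right) + \text{corr.}, \]

\[ \int_0^1 \sqrt{\frac{1 - x}{x}}\,f(x)\,dx = \frac{2\pi}{2n+1}\sum_{k=1}^{n} \sin^2\frac{k\pi}{2n+1}\,f\left(\cos^2\frac{k\pi}{2n+1}\right) + \text{corr.} \]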


See Stieltjes (1993) vol. 1, pp. 514-515. 


9.11 Notes on the Literature 


Chapters 10 and 11 of J. L. Chabert (1999) contain interesting observations on 
interpolation and quadrature with excerpts from original authors. Thomas Harriot’s 
manuscript De Numeris Triangularibus, containing his derivations of symbolic inter- 
polation formulae and their applications, has now been published in Beery and 
Stedall (2009), almost four centuries after it was written. Beery and Stedall provide a 
commentary to accompany the almost completely nonverbal presentation of Harriot. 
They also discuss the work on interpolation of several British mathematicians in the 
period 1610-1670. 
Series Transformation by Finite Differences


10.1 Preliminary Remarks 


Around 1670, James Gregory found a large number of new infinite series, but his 
methods remain somewhat unclear. From circumstantial evidence and from the form 
of some of his series, it appears that he was the first mathematician to systematically 
make use of finite difference interpolation formulas in finding new infinite series. 
The work of Newton, Gregory, and Leibniz made the method of finite differences 
almost as important as calculus in the discovery of new infinite series. We observe 
that interpolation formulas usually deal with finite expressions because in practice 
the number of interpolating points is finite. By theoretically extending the number 
of points to infinity, Gregory found the binomial theorem, the Taylor series, and 
numerous interesting series involving trigonometric functions. Gregory most likely 
derived these theorems from the Gregory—Newton (or Harriot—Briggs) interpolation 
formula. Gregory’s letter to Collins, of November 23, 1670,¹ explicitly mentions these
results, and also contains some other series that were not direct consequences of the
Harriot–Briggs result.² Instead, these other series seem to require the Newton–Gauss
interpolation formula; one is compelled to conclude that Gregory must have obtained
this interpolation formula, though it is not given anywhere in his surviving notes
and letters. In a separate enclosure with his letter to Collins, Gregory wrote several
formulas, including:³


Given an arc whose sine is d, and sine of the double arc is 2d — e, it is required to find another 
arc which bears to the arc whose sine is d the ratio a to c. The sine of the arc in question 


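\[ = \frac{ad}{c} - \frac{be}{c} + \frac{ke^2}{cd} - \frac{le^3}{cd^2} + \cdots, \tag{10.1} \]

\[ \frac{b}{c} = \frac{a(a^2 - c^2)}{2\cdot 3\cdot c^3}, \qquad \frac{k}{c} = \frac{a(a^2 - c^2)(a^2 - 4c^2)}{2\cdot 3\cdot 4\cdot 5\cdot c^5}, \ \ldots \]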


! Turnbull (1938) pp. 118-132. 
2 ibid. pp. 129-130. 
3 ibid. p. 128. 
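In modern notation, $c = r\theta$, $d = r\sin\theta$, $2d - e = r\sin 2\theta$; hence, $e = 2r\sin\theta(1 - \cos\theta)$
and the series takes the form

\[ r\sin\frac{a\theta}{c} = \frac{a}{c}\,r\sin\theta - \frac{a(a^2 - c^2)}{3!\,c^3}\,r\sin\theta\,\bigl(2(1 - \cos\theta)\bigr) + \frac{a(a^2 - c^2)(a^2 - 4c^2)}{5!\,c^5}\,r\sin\theta\,\bigl(2(1 - \cos\theta)\bigr)^2 - \cdots. \tag{10.2} \]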
5!0 


Gregory noted at the end of the enclosure that an infinite number of other ways of 
measuring circular arcs could be deduced from his calculations. 
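Gregory's series in its modern form (10.2) can be tested numerically. The sketch below takes $r = 1$ and writes `ratio` for $a/c$; both the function name and the choice of test values are illustrative assumptions, and the term-to-term recurrence is simply read off from the pattern of the coefficients $a(a^2 - c^2)(a^2 - 4c^2)\cdots$ in the series.

```python
import math

def gregory_sine(ratio, theta, terms=60):
    """Partial sum of (10.2) with r = 1 and ratio = a/c; approximates sin(ratio * theta)."""
    s2 = 2 * (1 - math.cos(theta))   # the quantity e/d in Gregory's notation
    term = ratio * math.sin(theta)   # first term, ad/c
    total = term
    for k in range(1, terms):
        # each new coefficient gains the factor -(ratio^2 - k^2) / ((2k)(2k + 1))
        term *= -(ratio ** 2 - k ** 2) * s2 / ((2 * k) * (2 * k + 1))
        total += term
    return total

print(gregory_sine(0.5, 1.0), math.sin(0.5))
print(gregory_sine(3, 0.7), math.sin(2.1))
```

When the ratio is an integer the factor $a^2 - k^2c^2$ eventually vanishes and the series terminates, giving the familiar multiple-angle formulas; for other ratios the series is infinite.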

Gregory did not publish his work on series and his mathematical letters to Collins 
were not printed until later. Meanwhile, Newton developed his profound ideas on 
interpolation and finite differences starting in the mid-1670s. In the early 1680s, he 
applied the method of differences to infinite series and in June/July of 1684, he wrote 
two short treatises on the topic.⁴ He was provoked into writing up his results upon
receiving a work from David Gregory, the nephew of James, Exercitatio Geometrica
de Dimensione Figurarum.⁵ In this treatise, David Gregory discussed several aspects
of infinite series, apparently without knowledge of Newton’s work. Newton evidently 
wished to set the record straight; he first wrote Matheseos Universalis Specimina, 
in which he pointed out that James Gregory and Leibniz were indebted to him in 
their study of series. He did not finish this treatise, but instead started a new one, 
called De Computo Serierum in which he eliminated all references to Gregory and 
Leibniz. The first chapter of the second treatise dealt with infinite series in a manner 
similar to that of his early works of 1669 and 1671. However, the second chapter 
employed the entirely new idea of applying finite differences to derive an important 
transformation of infinite series,⁶ often called Euler’s transformation. In modern
notation, this is given by 


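\[ A_0t + A_1t^2 + A_2t^3 + \cdots = A_0z + \Delta A_0\,z^2 + \Delta^2A_0\,z^3 + \cdots, \tag{10.3} \]

where $z = \frac{t}{1 - t}$, $\Delta A_0 = A_1 - A_0$, $\Delta^2A_0 = A_2 - 2A_1 + A_0$, etc.


Newton noted one remarkable special case of his transformation:

\[ \tan^{-1} t = t - \frac{t^3}{3} + \frac{t^5}{5} - \frac{t^7}{7} + \cdots = \frac{t}{1 + t^2}\left(1 + \frac{2}{1\cdot 3}\,\frac{t^2}{1 + t^2} + \frac{2\cdot 4}{1\cdot 3\cdot 5}\left(\frac{t^2}{1 + t^2}\right)^2 + \cdots\right). \tag{10.4} \]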


4 Newton (1967-1981) vol. 4, pp. 526-653. 
5 Gregory (1684). 
6 Newton (1967-1981) pp. 605-607.
Observe that when $t = 1$ we have $\frac{t^2}{1 + t^2} = \frac{1}{2}$ so that while the first series converges
very slowly for this value of $t$, the second series converges much more rapidly.
Moreover when $t = \sqrt{3}$, the first series is divergent while the second is convergent. In
fact, Newton wrote:⁷


The chief use for these transformations is to turn divergent series into convergent ones, and 
convergent series into ones more convergent. Series in which all terms are of the same sign 
cannot diverge without simultaneously coming to be infinitely great and on that account false. 
These, consequently, have no need to be turned into convergent ones. Those, however, in which 
the terms alternate in sign and proceed regularly, are so moderated by the successive addition and 
subtraction of those terms as to remain true even in divergence. But in their divergent form their 
quantity cannot be computed and they must be turned into convergent ones by the rule introduced, 
while when they are sluggishly convergent the rule must be applied to make them converge more 
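swiftly. Thus the series $y = x - \tfrac{1}{3}x^3 + \tfrac{1}{5}x^5 \cdots$, when it converges or diverges slowly enough
and has been turned into this one

\[ y = x - \tfrac{1}{3}x^3 + \tfrac{1}{5}x^5 - \tfrac{1}{7}x^7 + \tfrac{1}{9}x^9 - \tfrac{1}{11}x^{11} + \tfrac{1}{13}x^{13} - \tfrac{1}{15}x^{15} + \frac{x^{17}}{17(1 + xx)} + \frac{2x^{19}}{17\cdot 19(1 + xx)^2} + \frac{2\cdot 4\,x^{21}}{17\cdot 19\cdot 21(1 + xx)^3} \cdots \]

will speedily enough be computed to many places of decimals. If the same series proves swiftly
divergent it must be turned into the convergent

\[ xy = \frac{xx}{1 + xx} + \frac{2x^4}{1\cdot 3(1 + xx)^2} + \frac{2\cdot 4\,x^6}{1\cdot 3\cdot 5(1 + xx)^3} \cdots \tag{10.5} \]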


and then by what is presented in the following chapter it can be computed. It is, however, 
frequently convenient to reduce the coefficients A, B,C, ... to decimal fractions at the very start 
of the work. 


We note that Newton’s $A, B, C, \ldots$ are the $A_0, A_1, A_2, \ldots$ in (10.3) and $z = \frac{t^2}{1 + t^2}$
as in (10.4). We do not know why Newton discontinued work on this treatise. Perhaps 
it was because Edmond Halley visited Cambridge in August 1684 and urged Newton 
to work on problems of planetary motion. As is well known, Newton started work on 
the Principia soon after this visit and for the next two years concentrated on this work 
with little respite. 
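The difference in behavior of the two series in (10.4) at $t = 1$ and $t = \sqrt{3}$ is easy to observe numerically. The following sketch compares partial sums of both sides; the function names and the numbers of terms are illustrative assumptions.

```python
import math

def arctan_original(t, terms):
    """Partial sum of t - t^3/3 + t^5/5 - ..."""
    return sum((-1) ** k * t ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

def arctan_transformed(t, terms):
    """Partial sum of the right-hand side of (10.4)."""
    w = t * t / (1 + t * t)
    term = t / (1 + t * t)
    total = 0.0
    for k in range(terms):
        total += term
        term *= w * (2 * k + 2) / (2 * k + 3)  # next coefficient ratio: 2/3, 4/5, 6/7, ...
    return total

for t in (1.0, math.sqrt(3)):
    print(t, arctan_original(t, 20), arctan_transformed(t, 20), math.atan(t))
```

At $t = 1$ the original partial sums approach $\pi/4$ only slowly, while at $t = \sqrt{3}$ they grow without bound; the transformed series behaves like a geometric series with ratio $\frac{t^2}{1 + t^2}$ in both cases.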

Newton’s transformation (10.4) for the arctangent series is obviously important, so 
it is not surprising that others rediscovered it, since Newton’s paper did not appear 
in print until 1970. In August 1704, Jakob Bernoulli communicated⁸ the $t = 1$
case of (10.4) to Leibniz as a recent discovery of Jean Christophe Fatio de Duillier.
Jakob Hermann, a student of Bernoulli, found a proof for this and sent it to Leibniz
in January 1705. This proof is identical with Newton’s when specialized to
$t = 1$. Johann Bernoulli, and probably others, succeeded in deriving the general
form (10.4). Bernoulli, in fact, applied the general form in his 1742 notes on series⁹
and thereby derived a remarkable series for $\pi^2$ found earlier by Takebe Katahiro


7 ibid. p. 611. 
8 Newton (1967-1981) vol. IV, p. 608, footnote 44. 
9 Joh. Bernoulli (1742) vol. IV, p. 25. 
by a different technique.¹⁰ In 1717, the French mathematician Pierre Rémond de
Montmort (1678-1719) rediscovered Newton’s more general transformation (10.3)
with a different motivation and method of proof.¹¹

Montmort was born into a wealthy family of the French nobility and was self-taught 
in mathematics. An admirer of both Newton and Leibniz, he remained neutral but 
friendly with followers of both mathematicians during the calculus priority dispute 
in the early eighteenth century. He mainly worked in probability and combinatorics 
but also made contributions to the theory of series. His paper on series was inspired
by Brook Taylor’s 1715 work Methodus Incrementorum; consequently, Montmort’s
De Seriebus Infinitis Tractatus was published in the Philosophical Transactions with
an appendix by Taylor, then Secretary of the Royal Society. Montmort’s paper dealt
with those finite as well as infinite series to which the method of differences could
be applied. He first worked out the transformation of a finite power series and then
obtained Newton’s formula (10.3) as a corollary. He also quoted from a 1715 letter
of Niklaus I Bernoulli,¹² showing that Niklaus had found a result similar to that of
Newton. 

In 1717, François Nicole (1683-1758), a pupil of Montmort, also published a
paper¹³ on finite differences. He too wrote that his ideas were suggested by Taylor’s
book of 1715. The title of his paper, Traité du calcul des différences finies, indicated 
that he viewed the calculus of finite differences as a new topic in mathematics, separate 
from geometry, calculus, or algebra. By means of examples, he showed that the shifted 
factorial expression 


\[ (x)_n = x(x + h)(x + 2h)\cdots(x + (n - 1)h) \tag{10.6} \]

behaved under differencing as $x^n$ under differentiation. Thus,

\[ \Delta (x)_n = (x + h)_n - (x)_n = nh\,(x + h)_{n-1}. \tag{10.7} \]

Also, the difference relation

\[ \frac{1}{(x)_n} - \frac{1}{(x + h)_n} = \frac{nh}{(x)_{n+1}} \tag{10.8} \]

showed that the analog of $x^{-n}$ was $\frac{1}{(x)_n}$. Moreover, the inverse of a difference would
be a sum. And just as the derivative of a function indicated the integral of the
derived function, so also one could use the difference to find the sum. He gave an
example: From

\[ f(x) = \frac{x(x + 1)(x + 2)}{3}, \qquad f(x + 1) - f(x) = (x + 1)(x + 2) \tag{10.9} \]


10 Smith and Mikami (1914) pp. 148-150. 
11 Montmort (1717). 

12 ibid. pp. 674-675. 

13 Nicole (1717). 
he got

\[ 1\cdot 2 + 2\cdot 3 + 3\cdot 4 + \cdots + x(x + 1) = \frac{x(x + 1)(x + 2)}{3} + C. \tag{10.10} \]


By taking x = 1, he obtained the constant as zero. 
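Nicole's difference rules (10.7) and (10.8) and his summation example can be checked with exact rational arithmetic. The sketch below is only a verification aid; the function name `shifted_factorial` and the sample values of $x$, $h$ and $n$ are assumptions made for illustration.

```python
from fractions import Fraction

def shifted_factorial(x, n, h):
    """(x)_n = x (x + h) (x + 2h) ... (x + (n - 1) h); the empty product is 1."""
    product = Fraction(1)
    for i in range(n):
        product *= x + i * h
    return product

x, h, n = Fraction(5, 3), Fraction(1, 2), 4

# (10.7): the difference of (x)_n is n h (x + h)_(n-1)
lhs_107 = shifted_factorial(x + h, n, h) - shifted_factorial(x, n, h)
rhs_107 = n * h * shifted_factorial(x + h, n - 1, h)

# (10.8): the difference of 1/(x)_n is n h / (x)_(n+1)
lhs_108 = 1 / shifted_factorial(x, n, h) - 1 / shifted_factorial(x + h, n, h)
rhs_108 = n * h / shifted_factorial(x, n + 1, h)

# (10.10) with C = 0: 1*2 + 2*3 + ... + 10*11 against 10*11*12 / 3
partial = sum(j * (j + 1) for j in range(1, 11))
closed = shifted_factorial(10, 3, 1) / 3

print(lhs_107 == rhs_107, lhs_108 == rhs_108, partial == closed)
```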
In 1723, in his second memoir,¹⁴ Nicole discussed the problem of computing the
coefficients $a_0, a_1, a_2, \ldots$ in

\[ f(x + m) - f(x) = a_0 + a_1(x + h) + a_2(x + h)(x + 2h) + \cdots + a_{k-1}(x + h)\cdots(x + (k - 1)h), \tag{10.11} \]

where $f(x) = x(x + h)\cdots(x + (k - 1)h)$. His method employed a long inductive
process, but simpler procedures have since been found. In his second memoir and in
his third memoir of 1724,¹⁵ Nicole solved a similar problem for the inverse factorial
$\frac{1}{x(x + h)\cdots(x + (k - 1)h)}$.

Both Montmort and Nicole mentioned Taylor as the source of their inspiration; we 
note that Taylor gave a systematic exposition of finite differences and derivatives with 
their inverses, sums, and integrals. Many of these ideas were already known but Taylor 
explicitly laid out some of the concepts, such as the method of summation by parts. 
In a letter of November 14, 1715,¹⁶ Montmort also attributed to Taylor the summation
formula

\[ \frac{1}{b - a} = \frac{1}{b} + \frac{a}{b(b + d)} + \frac{a(a + d)}{b(b + d)(b + 2d)} + \cdots. \tag{10.12} \]


There are several ways of proving (10.12), but it is likely that Taylor proved it by 
the Gregory—Newton interpolation formula, since he used this to prove his famous 
series. 
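One of the several ways of checking (10.12) is a telescoping argument, which is not necessarily the route Taylor took. If $T_k$ denotes the ratio $\frac{a(a+d)\cdots(a+(k-1)d)}{b(b+d)\cdots(b+(k-1)d)}$, with $T_0 = 1$, then $T_k - T_{k+1}$ equals $(b - a)$ times the $k$th term of the series, so the sum of the first $N$ terms is $\frac{1 - T_N}{b - a}$. The sketch below verifies this identity exactly for sample values; the function name and the chosen values are illustrative assumptions.

```python
from fractions import Fraction

def taylor_partial_sum(a, b, d, terms):
    """Sum of the first `terms` terms of (10.12), together with the ratio T_terms."""
    total = Fraction(0)
    ratio = Fraction(1)  # T_k = a(a+d)...(a+(k-1)d) / (b(b+d)...(b+(k-1)d))
    for k in range(terms):
        total += ratio / (b + k * d)               # k-th term of the series
        ratio *= Fraction(a + k * d) / (b + k * d)  # advance T_k to T_(k+1)
    return total, ratio

total, remainder_ratio = taylor_partial_sum(1, 3, 1, 50)
print(total, remainder_ratio, Fraction(1, 3 - 1))
```

The series converges to $\frac{1}{b - a}$ exactly when $T_N$ tends to zero, which in this sample case ($a = 1$, $b = 3$, $d = 1$) it does.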

The Scottish mathematician James Stirling (1692-1770) took Nicole’s work much 
further. Stirling’s book, Methodus Differentialis, is sometimes called the first book 
on the calculus of finite differences. Like all prominent British mathematicians of the 
early eighteenth century, he was a disciple of Newton. His first paper, Lineae Tertii 
Ordinis Newtonianae,¹⁷ was an account with some extensions of Newton’s theory
of cubic curves. His second paper, Methodus Differentialis Newtoniana Illustrata,¹⁸
developed some of Newton’s ideas on interpolation. He later expanded this paper into 
the Methodus. Stirling received his early education in Scotland. In 1710 he traveled to 
Oxford and graduated from Balliol College the same year. He stayed on at Oxford on 
a scholarship, but he lost his support after the first Jacobite rebellion of 1715, as his 
family had strong Jacobite sympathies. He then spent several years in Italy and was 
unable to obtain a professorship there. Although details of his time in Italy are largely 


14 Nicole (1723).
15 Nicole (1724).
16 See Bateman (1907) p. 368.
17 Stirling (1717).
18 Stirling (1719).
unknown, his second paper was communicated from Venice. After returning to Britain
in 1722, he was given assistance by Newton, making him one of Newton’s devoted 
friends. After teaching in a London school, in 1735 Stirling began service as manager 
of the Leadhills Mines in Scotland where he was very successful, looking after the 
welfare of the miners as well as the interests of the shareholders. In the early 1750s, 
he also surveyed the River Clyde in preparation for a series of navigational locks. 

Stirling started his book where Nicole ended. In the introduction, he defined the 
Stirling numbers of the first and second kinds. These numbers appeared as coefficients 
when $z^n$ was expanded in terms of $z(z - 1)\cdots(z - k + 1)$, and $\frac{1}{z^n}$ was expanded in
terms of $\frac{1}{z(z + 1)\cdots(z + k - 1)}$. These expansions were required in order to apply the method
of differences to functions or quantities normally expressed in terms of powers of $z$.
Stirling constructed two small tables of these coefficients to make the transformation
easy to use. In the first few propositions of his book, Stirling considered problems
similar to those of Nicole, but he very quickly enlarged the scope of those methods.
He applied his new method to the approximate summation of series such as
$1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots$, whose approximate value had also been computed by Daniel Bernoulli,
Goldbach and Euler in the late 1720s. It was a little later that Euler brilliantly found
the exact value of the series to be $\frac{\pi^2}{6}$. Stirling also applied his method of differences
to derive several new and interesting transformations of series. For example,
observe propositions 7 and 8 of his Methodus Differentialis presented in modernized 
notation: 


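\[ 1 + \frac{z - n}{z(1 - m)} + \frac{(z - n)(z - n + 1)}{z(z + 1)(1 - m)^2} + \cdots = \frac{m - 1}{m}\left(1 + \frac{n}{zm} + \frac{n(n + 1)}{z(z + 1)m^2} + \cdots\right) \tag{10.13} \]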
and 
emg =n), Heme mt De - meant). 
az—-n+1) °° -e(zt-1(z—nt+1)(z—n4+2) 
= (14 nm | n(n + 1)m(m + 1) pe), (10.14) 
m zm+1) z(z+1)(m+1)(m + 2) 


Note that Newton’s transformation (10.4) for $\arctan t$ follows by taking $n = 1$,
$z = \frac{3}{2}$, and $m = 1 + \frac{1}{t^2}$ in (10.13). As we discuss in Chapter 23, these formulas
are particular cases of transformations of hypergeometric series. The hypergeometric
generalization of (10.13) was discovered by Pfaff in 1797,¹⁹ and the generalization of
(10.14) was found by Kummer in 1836.²⁰ Thus, the methods of hypergeometric series
provide the right context with the appropriate degree of generality to study the series 
(10.4), (10.13), and (10.14). Moreover, Gauss extended Stirling’s finite difference 
method to the theory of hypergeometric series and derived his well-known and 


19 Pfaff (1797a).
20 Kummer (1836). 
important contiguous relations for hypergeometric series.²¹ Even today, contiguous
relations continue to provide unexplored avenues for research.

We also note that since expressions of the form $z(z - 1)\cdots(z - k + 1)$ appear
in finite difference interpolation formulas, Stirling numbers of the first kind also
appear in those formulas. For this reason, in the early 1600s, Harriot computed these
numbers. Stirling numbers also cropped up in Lagrange’s 1770s proof²² of Wilson’s
theorem that $(p - 1)! + 1$ was divisible by $p$ if and only if $p$ was prime. In fact,
Lagrange’s proof gave the first number theoretic discussion of Stirling numbers of the 
first kind. 

Like Gregory, Leibniz, Taylor, and Nicole, Euler saw the intimate connections 
between the calculus of finite differences and the calculus of differentiation and 
integration. His influential 1755 book on differential calculus, Institutiones Calculi 
Differentialis,²³ began with a chapter on finite differences. The second chapter on
the use of differences in the summation of series discussed examples such as those
in Nicole’s work. In the second part of his book, Euler devoted the first chapter²⁴
to Newton’s transformation (10.3). He gave a proof different from Newton’s and 
from Montmort’s; this in turn led him to a further generalization of the formula. 
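Suppose

\[ S = a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots. \]

Euler then got the generalization

\[ A_0a_0 + A_1a_1x + A_2a_2x^2 + A_3a_3x^3 + \cdots = A_0S + \Delta A_0\,\frac{x}{1!}\frac{dS}{dx} + \Delta^2A_0\,\frac{x^2}{2!}\frac{d^2S}{dx^2} + \Delta^3A_0\,\frac{x^3}{3!}\frac{d^3S}{dx^3} + \cdots. \tag{10.15} \]

The Newton–Montmort formula followed by taking $a_0 = a_1 = a_2 = \cdots = 1$.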


10.2 Newton’s Transformation


In his 1684 Matheseos, Newton attempted to change slowly convergent series into 
more rapidly convergent ones. He considered the method of taking differences of 
the coefficients, but it was not until a little later that he arrived at the explicit 
and useful transformation (10.3) contained in the De Computo. He wrote the initial 
series as 


\[ v = At \pm Bt^2 + Ct^3 \pm Dt^4 + Et^5 \pm Ft^6 \ \text{\&c.} \tag{10.16} \]


21 Gauss (1813). 

22 Lagrange (1771).

23 Eu. I-10. E 212.

24 Eu. I-10, chapter 1. E 212.
Newton explained his transformation:²⁵


Here $A, B, C, \ldots$ are to denote the coefficients of the terms whose ultimate ratio, if the series be
extended infinitely, is one of equality, and 1 to $t$ is the remaining ratio of the terms; while the sign
$\pm$ is ambiguous and the converse of the sign $\mp$. Collect the first differences $b, b_2, b_3, \ldots$ of the
terms $A, B, C, \ldots$; then their second ones $c, c_2, c_3, \ldots$, third ones $d, d_2, d_3, \ldots$, and any number
of following ones. Collect these, however, by always taking a latter term from the previous one: $B$
from $A$, $C$ from $B$, ...; $b_2$ from $b$, $b_3$ from $b_2$, ...; $d_2$ from $d$, $d_3$ from $d_2$, ..., and so on. Then
make $\frac{t}{1 \mp t} = z$ and when the signs are appropriately observed there will be

\[ v = Az \mp bz^2 + cz^3 \mp dz^4 + ez^5 \mp fz^6 + \cdots. \]


He took the differences in reverse order of the modern convention. He had $A - B$,
$B - C$, $C - D, \ldots$ for the first differences instead of $B - A$, $C - B$, $D - C, \ldots$
and similarly for the higher-order differences. The revised version of the De Computo
did not include a proof but notes of an earlier version suggest the following iterative
argument:

\[ \begin{aligned} v &= z(1 \mp t)(A \pm Bt + Ct^2 \pm Dt^3 + Et^4 \pm Ft^5 \cdots) \\ &= z\bigl(A \mp (A - B)t - (B - C)t^2 \mp (C - D)t^3 - (D - E)t^4 \mp \cdots\bigr) \\ &= Az \mp z\bigl((A - B)t \pm (B - C)t^2 + (C - D)t^3 \pm (D - E)t^4 + \cdots\bigr). \end{aligned} \tag{10.17} \]

Now in (10.17), the last series in parentheses is of the same form as the original
series except that the coefficients are the differences of the coefficients of the original.
So the procedure can be repeated to give

\[ \begin{aligned} v &= Az \mp z\Bigl(z(A - B) \mp z\bigl((A - 2B + C)t \pm (B - 2C + D)t^2 + (C - 2D + E)t^3 \pm \cdots\bigr)\Bigr) \\ &= Az \mp (A - B)z^2 + (A - 2B + C)z^3 \mp \cdots. \end{aligned} \tag{10.18} \]


The final formula results from an infinite number of applications of this procedure;
Newton applied this formula to the logarithmic and arctangent series. In the case of the
logarithm, the transformation amounted to the equation $\ln(1 + x) = -\ln\left(1 - \frac{x}{1 + x}\right)$.
Newton’s main purpose was to use the transformation for numerical computation and
this explains why he applied the transformation after the eighth term $\frac{1}{15}x^{15}$ in (10.5)
rather than immediately at the outset. Note also that the first step in (10.17) was an
example of the summation by parts discussed explicitly by Taylor and later used by
Abel to study the convergence of series.
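The transformation can also be carried out directly from a table of differences. The sketch below uses the modern convention $\Delta A_0 = A_1 - A_0$ of (10.3) rather than Newton's reversed differences, and it truncates the transformed series after as many differences as the given coefficients allow; the function names and the choice of the logarithmic series as a test case are illustrative assumptions.

```python
import math

def forward_differences(coeffs):
    """Leading differences A_0, ΔA_0, Δ²A_0, ... of a finite coefficient list."""
    row = list(coeffs)
    leading = []
    while row:
        leading.append(row[0])
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    return leading

def newton_transform(coeffs, t):
    """Evaluate sum_k A_k t^(k+1) through (10.3), truncated to the available differences."""
    z = t / (1 - t)
    return sum(d * z ** (k + 1) for k, d in enumerate(forward_differences(coeffs)))

# ln(1 + x) = x - x^2/2 + x^3/3 - ... is sum_k A_k t^(k+1) with t = -x and A_k = -1/(k + 1).
count = 30
coeffs = [-1.0 / (k + 1) for k in range(count)]
direct = sum(a * (-1.0) ** (k + 1) for k, a in enumerate(coeffs))  # x = 1, untransformed
transformed = newton_transform(coeffs, -1.0)
print(direct, transformed, math.log(2))
```

With thirty coefficients the untransformed partial sum at $x = 1$ is still far from $\ln 2$, whereas the transformed sum behaves like a series in powers of $\frac{1}{2}$, in line with the equation $\ln(1 + x) = -\ln\left(1 - \frac{x}{1+x}\right)$ noted above.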


10.3 Montmort’s Transformation


In his 1717 paper, De Seriebus Infinitis Tractatus, Montmort started with elementary 
examples, but toward the end of the paper he posed the problem of summing or 
transforming the series 


25 Newton (1967-1981) vol. 4, pp. 605-607. 
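\[ \frac{a_0}{h} + \frac{a_1}{h^2} + \frac{a_2}{h^3} + \cdots. \tag{10.19} \]

He also discussed partial sums of this series, written as

\[ S_0 = \frac{A_0}{h}, \quad S_1 = \frac{A_1}{h^2}, \quad S_2 = \frac{A_2}{h^3}, \ \ldots, \tag{10.20} \]

where

\[ A_0 = a_0, \quad A_1 = a_0h + a_1, \quad A_2 = a_0h^2 + a_1h + a_2, \ \ldots. \tag{10.21} \]

He noted that a simple relation existed between the differences of the sequence
$A_0, A_1, A_2, \ldots$ and the sequence $a_0, a_1, a_2, \ldots$. He wrote down just the first three
cases: For $q = h - 1$,

\[ \begin{aligned} \Delta A_0 &= ha_0 + \Delta a_0, \\ \Delta^2 A_0 &= qha_0 + h\Delta a_0 + \Delta^2 a_0, \\ \Delta^3 A_0 &= q^2ha_0 + qh\Delta a_0 + h\Delta^2 a_0 + \Delta^3 a_0, \ \text{etc.} \end{aligned} \]

It is not difficult to write out the general relation from these examples. He then
proved the result that if for $k \ge l$, $\Delta^k a_0 = 0$, then

\[ \Delta^k A_0 = q^{k-l}\Delta^l A_0. \tag{10.22} \]

In fact, he verified this for $l = 1$ and 2, but noted that was sufficient to see that
the result was true in general. Montmort then proceeded to evaluate the partial sums
of (10.19) under the assumption that $\Delta^l a_0 = 0$ for some positive integer $l$. The basic
result used here and in other examples was that for any sequence $b_0, b_1, b_2, \ldots$ and a
positive integer $p$,

\[ b_p = b_0 + \binom{p}{1}\Delta b_0 + \binom{p}{2}\Delta^2 b_0 + \cdots + \binom{p}{p}\Delta^p b_0. \tag{10.23} \]

For example, to evaluate $A_p$ for $p > l$ and $\Delta^l a_0 = 0$, he employed (10.22) to
obtain

\[ \begin{aligned} A_p &= \left(A_0 + \binom{p}{1}\Delta A_0 + \cdots + \binom{p}{l-1}\Delta^{l-1}A_0\right) + \left(\binom{p}{l}\Delta^l A_0 + \binom{p}{l+1}\Delta^{l+1}A_0 + \cdots + \binom{p}{p}\Delta^p A_0\right) \\ &= A_0 + \binom{p}{1}\Delta A_0 + \cdots + \binom{p}{l-1}\Delta^{l-1}A_0 + \frac{\Delta^l A_0}{q^l}\left(\binom{p}{l}q^l + \binom{p}{l+1}q^{l+1} + \cdots + \binom{p}{p}q^p\right) \\ &= A_0 + \binom{p}{1}\Delta A_0 + \cdots + \binom{p}{l-1}\Delta^{l-1}A_0 + \frac{\Delta^l A_0}{q^l}\left(h^p - 1 - \binom{p}{1}q - \cdots - \binom{p}{l-1}q^{l-1}\right). \end{aligned} \tag{10.24} \]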


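As an application of this formula, he gave the sum of the finite series

\[ 1\cdot 3 + 3\cdot 3^2 + 6\cdot 3^3 + 10\cdot 3^4 + 15\cdot 3^5 + 21\cdot 3^6. \]

Here $h = \frac{1}{3}$, $q = -\frac{2}{3}$, $p = 5$, and $\Delta^3 a_0 = 0$. Next, as a corollary, Montmort stated
without proof the transformation formula

\[ \sum_{k=0}^{\infty} \frac{a_k}{h^{k+1}} = \sum_{k=0}^{\infty} \frac{\Delta^k a_0}{(h - 1)^{k+1}}. \tag{10.25} \]

Of course, (10.25) is Newton’s formula. Since equation (10.23) was Montmort’s basic
formula, we may assume that he applied it to give a formal calculation to justify
(10.25). Indeed, this is easy to do:

\[ \sum_{k=0}^{\infty} \frac{a_k}{h^{k+1}} = \sum_{k=0}^{\infty} \frac{1}{h^{k+1}}\sum_{j=0}^{k} \binom{k}{j}\Delta^j a_0 = \sum_{j=0}^{\infty} \Delta^j a_0 \sum_{k=0}^{\infty} \binom{k + j}{j}\frac{1}{h^{k+j+1}} = \sum_{j=0}^{\infty} \frac{\Delta^j a_0}{(h - 1)^{j+1}}, \]

where the last step used the binomial theorem for negative integer exponents.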

Unfortunately, he did not give any interesting examples of this formula. The three 
cases he explicitly mentioned follow from the binomial theorem just as easily. We 
mention that Zhu Shijie (c. 1260-1320), also known as Chu Shi-Chieh, knew (10.23)
and used it to sum finite series in his Siyuan Yujian of 1303.²⁶
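Montmort's relations can be verified on his own example with exact fractions. The sketch below is an illustration under stated assumptions: the helper names are invented for the purpose, the recurrence $A_{k+1} = hA_k + a_{k+1}$ is read off from (10.21), and the value $h = 3$ used in the last check is a sample chosen so that the infinite series in (10.25) converges.

```python
from fractions import Fraction
from math import comb

def leading_differences(seq):
    """Return b_0, Δb_0, Δ²b_0, ... for a finite sequence."""
    row, out = list(seq), []
    while row:
        out.append(row[0])
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    return out

def from_differences(diffs, p):
    """(10.23): rebuild b_p from the leading differences."""
    return sum(comb(p, j) * diffs[j] for j in range(p + 1))

def numerators(a, h):
    """(10.21): A_0 = a_0 and A_(k+1) = h A_k + a_(k+1)."""
    out = [Fraction(a[0])]
    for term in a[1:]:
        out.append(h * out[-1] + term)
    return out

def montmort_sum(diffs, h):
    """Right-hand side of (10.25) when only finitely many differences are nonzero."""
    return sum(d / (h - 1) ** (k + 1) for k, d in enumerate(diffs))

a = [1, 3, 6, 10, 15, 21]   # Montmort's coefficients; their third differences vanish
h = Fraction(1, 3)
q = h - 1
A = numerators(a, h)
finite_sum = A[5] / h ** 6   # S_5 = A_5 / h^6, the sum of the six-term series
dA = leading_differences(A)
print(finite_sum, dA[3], dA[4], dA[5], montmort_sum([1, 2, 1], Fraction(3)))
```

The printed differences can be compared with (10.22) for $l = 3$, and the last value with a long partial sum of $\sum a_k/3^{k+1}$ for the same coefficients continued as triangular numbers.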


10.4 Euler’s Transformation Formula 


As we mentioned earlier, Newton did not publish his transformation formula. It is not 
certain whether or not Euler saw Montmort’s paper on this topic. In any case, Euler’s 
approach differed from those of Newton and Montmort. Euler’s proof of (10.3) applied 
the change of variables in the first step. We present the proof as Euler set it out.²⁷ He let

\[ S = ax + bx^2 + cx^3 + dx^4 + ex^5 + \text{\&c.} \]


26 Hoe (2007) p. 401. 
27 Eu. I-10, sections 2 and 3. E 212.
and let $x = \frac{y}{1 + y}$ and replaced the powers of $x$ by the series

\[ \begin{aligned} x &= y - y^2 + y^3 - y^4 + y^5 - y^6 + \text{\&c.}, \\ x^2 &= y^2 - 2y^3 + 3y^4 - 4y^5 + 5y^6 - 6y^7 + \text{\&c.}, \\ x^3 &= y^3 - 3y^4 + 6y^5 - 10y^6 + 15y^7 - 21y^8 + \text{\&c.}, \\ x^4 &= y^4 - 4y^5 + 10y^6 - 20y^7 + 35y^8 - 56y^9 + \text{\&c.} \end{aligned} \]

Thus,

\[ \begin{array}{rlllll} S = & ay & - ay^2 & + ay^3 & - ay^4 & + ay^5 - \text{\&c.} \\ & & + b & - 2b & + 3b & - 4b \\ & & & + c & - 3c & + 6c \\ & & & & + d & - 4d. \end{array} \]

Note that the coefficients of the various powers of $y$ were presented in columns. Since
$y = \frac{x}{1 - x}$,

\[ S = a\,\frac{x}{1 - x} + (b - a)\frac{x^2}{(1 - x)^2} + (c - 2b + a)\frac{x^3}{(1 - x)^3} + \text{\&c.}, \]

yielding the transformation formula.

Note that the series for $x$ is the geometric series, while the series for $x^2, x^3, x^4, \ldots$
can be obtained by the binomial theorem or by the differentiation of the series for $x$.
In fact, Euler must have had differentiation in mind, since his proof of the second
transformation formula (10.15) was obtained²⁸ by writing the right-hand side of
(10.15) as

\[ \alpha S + \beta\,\frac{x}{1!}\frac{dS}{dx} + \gamma\,\frac{x^2}{2!}\frac{d^2S}{dx^2} + \delta\,\frac{x^3}{3!}\frac{d^3S}{dx^3} + \cdots \]

and then substituting the series for $S, \frac{dS}{dx}, \frac{d^2S}{dx^2}, \ldots$. By equating the coefficients of the
various powers of $x$, he found $\alpha = A_0$, $\beta = A_1 - A_0$, $\gamma = A_2 - 2A_1 + A_0$, and so
on.

Euler gave several examples of these formulas in his differential calculus book. At times writing $xx$ and at other times using $x^2$, he considered the problem of summing the series

$$1x + 4xx + 9x^3 + 16x^4 + 25x^5 + \text{etc.} \tag{10.26}$$

The first and second differences of the coefficients $1, 4, 9, 16, 25, \ldots$ were $3, 5, 7, 9, \ldots$ and $2, 2, 2, \ldots$. Therefore, the third differences were zero, and by equation (10.3), the sum of the series (10.26) came out to be

$$\frac{x}{1-x} + \frac{3x^2}{(1-x)^2} + \frac{2x^3}{(1-x)^3} = \frac{x + xx}{(1-x)^3}. \tag{10.27}$$

To sum the finite series

$$1x + 4x^2 + 9x^3 + 16x^4 + \cdots + n^2x^n, \tag{10.28}$$

Euler subtracted $(n+1)^2x^{n+1} + (n+2)^2x^{n+2} + \cdots$ from (10.26), the series in the previous example. He found the sum of this infinite series to be

$$\frac{x^{n+1}}{1-x}\left((n+1)^2 + \frac{(2n+3)x}{1-x} + \frac{2x^2}{(1-x)^2}\right). \tag{10.29}$$

28 Eu. I-10, section 26. E 212.


Euler observed the general rule, already established by Montmort, that if a power 
series had coefficients such that $\Delta^n A_0 = 0$ for $n > k$, then the series would sum to a
rational function. 
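Euler's treatment of the finite series of squares can be verified with exact rational arithmetic. The following is a minimal sketch, assuming the readings of (10.27) and (10.29) as $(x + x^2)/(1-x)^3$ and $\frac{x^{n+1}}{1-x}\big((n+1)^2 + \frac{(2n+3)x}{1-x} + \frac{2x^2}{(1-x)^2}\big)$; the function names are hypothetical.

```python
from fractions import Fraction

def squares_series_sum(x):
    """Closed form (10.27) for x + 4x^2 + 9x^3 + ..."""
    return (x + x * x) / (1 - x) ** 3

def squares_series_tail(n, x):
    """Closed form (10.29) for (n+1)^2 x^(n+1) + (n+2)^2 x^(n+2) + ..."""
    y = x / (1 - x)
    return x ** (n + 1) / (1 - x) * ((n + 1) ** 2 + (2 * n + 3) * y + 2 * y * y)

def finite_squares_sum(n, x):
    """Euler's subtraction: the finite series (10.28)."""
    return squares_series_sum(x) - squares_series_tail(n, x)

x = Fraction(1, 3)
print(finite_squares_sum(5, x), sum(k * k * x ** k for k in range(1, 6)))
```

Since both sides are rational functions of $x$, the finite identity holds for any $x \neq 1$, including values outside the interval of convergence of the infinite series.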

When Euler discovered a new method of summation or a new transformation of 
series, he applied it to divergent as well as convergent series. He believed that divergent 
series could be studied and used in a meaningful way. He explained that whenever 
he assigned a sum to a divergent series by a given method, he arrived at the same 
sum by alternative methods, leading him to conclude that divergent series could be 
legitimately summed. Applying the differences method, he found the sums of various 


divergent series, including $1 - 4 + 9 - 16 + 25 - \cdots$; $\ln 2 - \ln 3 + \ln 4 - \ln 5 + \cdots$;
and $1 - 2 + 6 - 24 + 120 - 720 + \cdots$. Observe that the terms in the third
example are $1!, 2!, 3!, 4!, \ldots$. This was one of Euler’s favorite divergent series. By
taking twelve terms of the transformed series and using (10.3), he found the sum to
be 0.4036524077.²⁹ He must have later realized that this sum was very inaccurate,
since he reconsidered it in a 1760 paper.³⁰ According to Jacobi’s remarks, included in
the Euler Archive in connection with this paper, this 1760 paper was read to the Berlin
Academy in October, 1746. This information tallies with Euler’s letter to Goldbach of
August 7, 1746, in which he discussed the series $1 - 1 + 2 - 6 + 24 - 120 + \cdots$; to
obtain the value of this series, he took the function


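$$f(x) = 1 - x + 2x^2 - 6x^3 + 24x^4 - \cdots$$

and proceeded to express $f(x)$ as an integral and as a continued fraction:

$$f(x) = \frac{1}{x}\,e^{1/x}\int_0^x \frac{e^{-1/t}}{t}\,dt;$$

$$f(x) = \cfrac{1}{1+\cfrac{x}{1+\cfrac{x}{1+\cfrac{2x}{1+\cfrac{2x}{1+\cfrac{3x}{1+\cfrac{3x}{1+\cdots}}}}}}}.$$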


Euler then computed the value of the original series $f(1)$. To approximately
evaluate the integral, he divided the interval $[0,1]$ into ten equal parts and used the
trapezoidal rule. By this method, $f(1) = 0.59637255$. The continued fraction, on
the other hand, gave $f(1) = 0.59637362123$, correct to eight decimal places. Even
earlier, in a letter to Goldbach of August 7, 1745,³¹ Euler wrote that he had worked
out the continued fraction for the divergent series $f(1)$ and found the value to be
approximately 0.5963475922, adding the remark that in a small dispute with Niklaus
I Bernoulli about the value of a divergent series, he himself had argued that all series
such as $f(1)$ must have a definite value.³²

29 ibid. section 10.
30 Eu. I-14, pp. 585-617. E 247.

Figure 10.1 Difference table.
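Both of Euler's routes to $f(1)$ are easy to retrace numerically. The sketch below rests on two assumptions about the displays above: that the continued fraction has partial numerators $x, x, 2x, 2x, 3x, 3x, \ldots$ over partial denominators 1, and that the integral at $x = 1$ reads $e\int_0^1 e^{-1/t}\,t^{-1}\,dt$. The function names are hypothetical and floating-point arithmetic replaces Euler's hand computation.

```python
import math

def euler_continued_fraction(x, depth):
    """1/(1 + x/(1 + x/(1 + 2x/(1 + 2x/(1 + ...))))) cut off after `depth` numerators."""
    value = 1.0
    for j in range(depth, 0, -1):
        value = 1.0 + ((j + 1) // 2) * x / value   # numerators x, x, 2x, 2x, ...
    return 1.0 / value

def integrand(t):
    """e^(1 - 1/t) / t, extended by its limit 0 at t = 0."""
    return 0.0 if t == 0 else math.exp(1.0 - 1.0 / t) / t

def trapezoid_f1(parts):
    """Trapezoidal rule on [0, 1] with `parts` equal subintervals."""
    h = 1.0 / parts
    inner = sum(integrand(k * h) for k in range(1, parts))
    return h * (0.5 * (integrand(0.0) + integrand(1.0)) + inner)

print(euler_continued_fraction(1.0, 400))
print(trapezoid_f1(10))
```

With ten subintervals the trapezoidal value agrees with the continued fraction only to about four decimal places, which is the kind of gap the two figures quoted in the text display.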

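In his differential calculus book, Euler gave a few applications of his more general
formula (10.15), including the derivation of the exponential generating function of a
sequence whose third difference was zero.³³ He began with the difference table in
Figure 10.1.

From this, he derived

$$2 + 5x + \frac{10x^2}{2} + \frac{17x^3}{6} + \frac{26x^4}{24} + \text{etc.} = e^x(2 + 3x + xx) = e^x(1+x)(2+x).$$

More generally, he noted that when $S = e^x$, the result was

$$A_0 + \frac{A_1x}{1} + \frac{A_2x^2}{1\cdot2} + \frac{A_3x^3}{1\cdot2\cdot3} + \frac{A_4x^4}{1\cdot2\cdot3\cdot4} + \text{etc.} = e^x\left(A_0 + \frac{x\,\Delta A_0}{1} + \frac{x^2\Delta^2A_0}{1\cdot2} + \frac{x^3\Delta^3A_0}{1\cdot2\cdot3} + \frac{x^4\Delta^4A_0}{1\cdot2\cdot3\cdot4} + \text{etc.}\right).$$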


In fact, this result is equivalent to $(1 + \Delta)^n A_0 = A_n$, that is (10.23), as is
quickly verified by writing e* as a series and multiplying the two series. Jacobi gave 
a very interesting application of Euler’s general transformation formula (10.15) to 
the derivation of the Pfaff—Gauss transformation for hypergeometric functions and we 
discuss this in Chapter 23. 
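The exponential generating function identity can be tested numerically. In the sketch below, the sequence 2, 5, 10, 17, 26, ... is taken to be $A_n = n^2 + 2n + 2$, an assumption made here that matches leading differences 2, 3, 2 and hence Euler's factor $2 + 3x + xx$; the helper names are hypothetical.

```python
import math

def leading_differences(values):
    """Return [A_0, ΔA_0, Δ²A_0, ...] from a finite list of values."""
    diffs, row = [], list(values)
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs

def egf_direct(a, x, terms=60):
    """Partial sum of A_0 + A_1 x/1! + A_2 x^2/2! + ..."""
    return sum(a(n) * x ** n / math.factorial(n) for n in range(terms))

def egf_by_differences(values, x):
    """e^x (A_0 + x ΔA_0/1! + x^2 Δ²A_0/2! + ...) from the leading differences."""
    return math.exp(x) * sum(d * x ** k / math.factorial(k)
                             for k, d in enumerate(leading_differences(values)))

a = lambda n: n * n + 2 * n + 2        # assumed closed form for 2, 5, 10, 17, 26
x = 0.7
direct = egf_direct(a, x)
via_diff = egf_by_differences([a(n) for n in range(5)], x)
closed = math.exp(x) * (2 + 3 * x + x * x)
print(direct, via_diff, closed)
```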


10.5 Stirling’s Transformation Formulas 


Stirling’s new generalization of Newton’s transformation of the arctangent series
(10.4) was a particular case of a hypergeometric transformation discovered by
Pfaff in 1797.³⁴ Stirling stated his formula³⁵ as proposition 7 of his 1730 book
and his proof made remarkable use of difference equations. Beginning with a series
satisfying a certain difference equation, or recurrence relation, he then showed that
the transformed series satisfied the same difference equation. Adhering closely to
Stirling’s exposition, we state the theorem: If the successive terms $T$ and $T'$ of a series
$S$ satisfied $(z - n)T + (m - 1)zT' = 0$, then $S$ could be transformed to

$$S = \frac{m-1}{m}\,T + \frac{n}{z}\times\frac{A}{m} + \frac{n+1}{z+1}\times\frac{B}{m} + \frac{n+2}{z+2}\times\frac{C}{m} + \frac{n+3}{z+3}\times\frac{D}{m} + \text{etc.}$$

31 Fuss (1968) vol. I, pp. 323-328.
32 For Euler’s comments on summing divergent series, see his letters to Niklaus I Bernoulli: Eu. IVA-2 pp. 579-643, especially pp. 589-590 and pp. 604-606.
33 Eu. I-10, section 27. E 212.


Stirling’s notation made unusual use of the symbol $z$. In the equation $(z - n)T + (m - 1)zT' = 0$, $T$ and $T'$ represented any two successive terms of the series $S$. The
value of $z$ changed by one when Stirling moved from one pair to the next. To see how
this worked, take $S = T_0 + T_1 + T_2 + T_3 + \cdots$. The initial relation (between the first
two terms) could then be expressed as $(z - n)T_0 + (m - 1)zT_1 = 0$ and, in general,
the relation between two successive terms would be

$$T_{k+1} = -\frac{z - n + k}{(z + k)(m - 1)}\,T_k. \tag{10.30}$$

Thus, the relation between successive terms produced the series

$$S = T_0\left(1 + \frac{z-n}{z(1-m)} + \frac{(z-n)(z-n+1)}{z(z+1)(1-m)^2} + \cdots\right). \tag{10.31}$$

In Stirling’s notation, $A, B, C, \ldots$ each represented the previous term so that

$$A = \frac{m-1}{m}\,T \left(= \frac{m-1}{m}\,T_0\right), \quad B = \frac{n}{z}\times\frac{A}{m}, \quad C = \frac{n+1}{z+1}\times\frac{B}{m}, \quad \text{etc.}$$

Hence, the transformed series could be written as

$$\frac{m-1}{m}\,T_0\left(1 + \frac{n}{z}\cdot\frac{1}{m} + \frac{n(n+1)}{z(z+1)}\cdot\frac{1}{m^2} + \frac{n(n+1)(n+2)}{z(z+1)(z+2)}\cdot\frac{1}{m^3} + \cdots\right). \tag{10.32}$$

Thus, Stirling’s transformation formula is equivalent to the statement that the series
in (10.31) equals the series in (10.32). We use modern notation to derive Stirling’s
difference equation. Let

$$S_k = \sum_{n=k}^{\infty} T_n = T_k\,y_k$$

so that

$$T_{k+1}\,y_{k+1} = S_{k+1} = S_k - T_k = T_k(y_k - 1). \tag{10.33}$$


34 Pfaff (1797a).
35 Stirling and Tweddle (2003) pp. 57-60.

By relation (10.30), (10.33) would become

$$(m-1)y_k + y_{k+1} - \frac{n}{z+k}\,y_{k+1} - m + 1 = 0. \tag{10.34}$$

Since Stirling wrote $y$ and $y'$ for any two successive $y_k$ and $y_{k+1}$, he could write $z$
instead of $z + k$ to get the recurrence relation

$$(m-1)y + y' - \frac{n}{z}\,y' - m + 1 = 0. \tag{10.35}$$
z 


In proving his transformation formula, Stirling assumed that

$$y = a + \frac{b}{z} + \frac{c}{z(z+1)} + \frac{d}{z(z+1)(z+2)} + \frac{e}{z(z+1)(z+2)(z+3)} + \text{etc.} \tag{10.36}$$

so he had

$$y' = a + \frac{b}{z+1} + \frac{c}{(z+1)(z+2)} + \frac{d}{(z+1)(z+2)(z+3)} + \text{etc.} \tag{10.37}$$

Before substituting these expressions for $y$ and $y'$ in (10.35), Stirling observed that

$$y' = a + \frac{b}{z} + \frac{c-b}{z(z+1)} + \frac{d-2c}{z(z+1)(z+2)} + \frac{e-3d}{z(z+1)(z+2)(z+3)} + \text{etc.}, \tag{10.38}$$

an equation we can easily see to be true by taking the term-by-term difference of
the series for $y$ and $y'$ in (10.36) and (10.37). Next, he substituted the expression
(10.38) for $y'$ in (10.35) but used (10.37) for the term $\frac{n}{z}y'$ in (10.35). After these
substitutions, equation (10.35) became

$$ma - m + 1 + \frac{mb - na}{z} + \frac{mc - (n+1)b}{z(z+1)} + \frac{md - (n+2)c}{z(z+1)(z+2)} + \cdots = 0.$$

On setting the like terms equal to zero, he got

$$a = \frac{m-1}{m}, \quad b = \frac{n}{m}\,a, \quad c = \frac{n+1}{m}\,b, \quad d = \frac{n+2}{m}\,c, \quad \ldots$$

When these values were substituted back in $y$, Stirling got his result.
Stirling applied this transformation to the approximate summation of the series

$$\frac{1}{2}\left(1 + \frac{1}{3} + \frac{1\cdot2}{3\cdot5} + \frac{1\cdot2\cdot3}{3\cdot5\cdot7} + \frac{1\cdot2\cdot3\cdot4}{3\cdot5\cdot7\cdot9} + \cdots\right). \tag{10.39}$$

Now this is the series one gets upon taking $t = 1$ in Newton’s formula (10.4),
and its value is therefore $\frac{\pi}{4}$. It is interesting to note that, after posing the problem
of approximately summing the series (10.39), Newton had abruptly ended the
second chapter of his unpublished Matheseos of 1684 with the word “Inveniend”
(to be found).³⁶ Thus, although the Matheseos went unpublished, Stirling took up
the very problem left pending by Newton. Stirling began by adding the first twelve
terms of the series and applied his transformation (10.13) to the remaining (infinite)
part of the series, yielding

$$\frac{12!}{3\cdot5\cdots25}\left(1 - \frac{1}{27} + \frac{1\cdot3}{27\cdot29} - \frac{1\cdot3\cdot5}{27\cdot29\cdot31} + \cdots\right). \tag{10.40}$$

To approximate this sum, he took the first twelve terms of this series and found the
value of $\frac{\pi}{4}$ as 0.78539816339. Since the terms are alternating and decreasing, Stirling
could also have easily determined bounds on the error by using results of Leibniz
dating from the 1680s. However, Leibniz’s results, though communicated to Johann
Bernoulli in 1713-14,³⁷ went unpublished for a long period.
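Stirling's computation can be retraced in exact arithmetic. This is a sketch under the reading of (10.39) and (10.40) given above: the direct terms are $\frac12\,k!/(3\cdot5\cdots(2k+1))$, and after `head` of them the transformed remainder has prefactor $\text{head}!/(3\cdot5\cdots(2\,\text{head}+1))$ and term ratios $-(2j+1)/(2\,\text{head}+3+2j)$. The function names and the generalisation from 12 to an arbitrary `head` are illustrative only.

```python
from fractions import Fraction
import math

def direct_partial_sum(count):
    """Sum of the first `count` terms of (10.39)."""
    total, t = Fraction(0), Fraction(1, 2)
    for k in range(count):
        total += t
        t *= Fraction(k + 1, 2 * k + 3)      # T_{k+1} / T_k
    return total

def stirling_pi_over_4(head=12, tail_terms=12):
    """`head` direct terms plus `tail_terms` terms of the transformed remainder."""
    total, t = Fraction(0), Fraction(1, 2)
    for k in range(head):
        total += t
        t *= Fraction(k + 1, 2 * k + 3)
    u = 2 * t                                 # head!/(3*5*...*(2*head+1))
    for j in range(tail_terms):
        total += u
        u *= -Fraction(2 * j + 1, 2 * head + 3 + 2 * j)
    return total

print(float(stirling_pi_over_4()), math.pi / 4)
print(float(direct_partial_sum(24)))          # same number of terms, no transformation
```

Twenty-four terms used this way give roughly eleven correct decimals, whereas twenty-four terms of the original series give only about eight.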

Proposition 8³⁸ of Stirling’s Methodus Differentialis was a transformation of
what we now call a generalized hypergeometric series, noteworthy as an important
particular case of a formula discovered by Kummer in 1836³⁹ in the course
of his efforts to generalize Gauss’s 1813 theory of hypergeometric series. After
having shown the manner in which Stirling stated his propositions, we now state
Stirling’s eighth proposition in a form more immediately understandable to modern
readers:

$$\begin{aligned}
&\frac{1}{z-n} + \frac{z-m}{z(z-n+1)} + \frac{(z-m)(z-m+1)}{z(z+1)(z-n+2)} + \frac{(z-m)(z-m+1)(z-m+2)}{z(z+1)(z+2)(z-n+3)} + \cdots \\
&\quad = \frac{1}{m} + \frac{n}{z(m+1)} + \frac{n(n+1)}{z(z+1)(m+2)} + \frac{n(n+1)(n+2)}{z(z+1)(z+2)(m+3)} + \cdots.
\end{aligned} \tag{10.41}$$

Let $S_k$ denote the sum of the series on the left after the first $k$ terms have been
removed:

$$S_k = \frac{(z-m)\cdots(z-m+k-1)}{z(z+1)\cdots(z+k-1)}\left(\frac{1}{z-n+k} + \frac{z-m+k}{(z+k)(z-n+k+1)} + \frac{(z-m+k)(z-m+k+1)}{(z+k)(z+k+1)(z-n+k+2)} + \cdots\right).$$

Denote the sum in parentheses by $y_k$. It is simple to check that

$$y_k - y_{k+1} + \frac{m}{z+k}\,y_{k+1} - \frac{1}{z-n+k} = 0,$$

a relation Stirling wrote as

$$y - y' + \frac{m}{z}\,y' = \frac{1}{z-n}. \tag{10.42}$$

36 Newton (1967-1981) vol. IV, pp. 552-553.
37 Leibniz (1971) vol. 3, pp. 922-923.
38 Stirling and Tweddle (2003) pp. 60-64.
39 Kummer (1836).


He then assumed that for some $a, b, c, d, \ldots$

$$y = \frac{a}{m} + \frac{b}{(m+1)z} + \frac{c}{(m+2)z(z+1)} + \frac{d}{(m+3)z(z+1)(z+2)} + \cdots;$$

substituting this into the left side of (10.42), after a simple calculation, he obtained

$$y - y' + \frac{m}{z}\,y' = \frac{a}{z} + \frac{b}{z(z+1)} + \frac{c}{z(z+1)(z+2)} + \frac{d}{z(z+1)(z+2)(z+3)} + \cdots. \tag{10.43}$$

Stirling applied Taylor’s formula (10.12) to the right side of (10.42) to get

$$\frac{1}{z-n} = \frac{1}{z} + \frac{n}{z(z+1)} + \frac{n(n+1)}{z(z+1)(z+2)} + \frac{n(n+1)(n+2)}{z(z+1)(z+2)(z+3)} + \cdots \tag{10.44}$$

and equated the coefficients in (10.43) and (10.44) to obtain

$$a = 1, \quad b = n, \quad c = n(n+1), \quad d = n(n+1)(n+2), \ldots.$$


This proves the transformation formula (10.41). As we mentioned, a century later 
Kummer obtained a more general result but he did not seem to have been aware of 
Stirling’s work. 


Newton pointed out in his second letter for Leibniz,⁴⁰ of October 24, 1676, that
$\sin^{-1}\frac{1}{2}$ was more convenient than $\sin^{-1}\frac{1}{\sqrt{2}}$ for computing $\pi$ because it converged
rapidly. Clearly, $\sin^{-1} 1$ would not give a result even as good as $\sin^{-1}\frac{1}{\sqrt{2}}$. Nevertheless,
Stirling’s transformation was so powerful that, when applied to $\sin^{-1} 1$, it
caused it to converge rapidly enough to be useful for computation. Stirling summed
up the first twelve terms directly and then applied the transformation to the remainder,
thereby achieving the value of $\pi$ to eight decimal places.


10.6 Nicole’s Examples of Sums 


The method of finite differences is also useful in the summation of series, as noted 
by Mengoli, Leibniz, Jakob Bernoulli, and Montmort. Nicole, student of Montmort, 
wished to establish a new subject devoted to the calculus of finite differences. 
Analogous to integration in the calculus, summation of series had to be developed 
within this new subject. Nicole attacked this problem by summing examples of certain 
kinds of series and Montmort gave similar examples. The basic idea was that, given a 
function $g(x)$ such that $g(x+h) - g(x) = f(x)$, the sum would be

$$f(a) + f(a+h) + \cdots + f(a + (n-1)h) = g(a + nh) - g(a).$$

Nicole’s examples took $f(x) = (x)_m$ or $f(x) = \frac{1}{(x)_m}$, where $(x)_m$ was given by
(10.6). So, by (10.7) and (10.8), $g(x)$ could be taken to be, in the first case,

$$\frac{1}{h(m+1)}\,(x-h)_{m+1}$$

and, in the second case,

$$-\frac{1}{h(m-1)}\cdot\frac{1}{(x)_{m-1}}.$$

Thus, the sum $g(a + nh) - g(a)$ would be, in the first case,

$$\frac{1}{h(m+1)}\Big((a + (n-1)h)_{m+1} - (a-h)_{m+1}\Big)$$

and, in the second case,

$$\frac{1}{h(m-1)}\left(\frac{1}{(a)_{m-1}} - \frac{1}{(a+nh)_{m-1}}\right).$$

40 Newton (1959-1960) vol. II, p. 139.


Although he did not provide the general formulas, Nicole worked out special cases. 
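The two general formulas can be confirmed by brute force. The sketch assumes that $(x)_m$ in (10.6) denotes the product $x(x+h)\cdots(x+(m-1)h)$, which is the reading consistent with the sums stated above; the function names and the sample parameters are illustrative.

```python
from fractions import Fraction

def factorial_product(x, m, h):
    """(x)_m = x (x+h) ... (x+(m-1)h); assumed meaning of (10.6)."""
    out = Fraction(1)
    for i in range(m):
        out *= x + i * h
    return out

def sum_products(a, m, h, n):
    """Closed form for (a)_m + (a+h)_m + ... + (a+(n-1)h)_m."""
    top = factorial_product(a + (n - 1) * h, m + 1, h) - factorial_product(a - h, m + 1, h)
    return top / (h * (m + 1))

def sum_reciprocals(a, m, h, n):
    """Closed form for 1/(a)_m + 1/(a+h)_m + ... + 1/(a+(n-1)h)_m, m >= 2."""
    diff = 1 / factorial_product(a, m - 1, h) - 1 / factorial_product(a + n * h, m - 1, h)
    return diff / (h * (m - 1))

# a = 1, h = 3, m = 4: the terms are 1*4*7*10, 4*7*10*13, ...
print(sum_products(1, 4, 3, 3))
# a = 1, h = 2, m = 4: the terms are 1/(1*3*5*7), 1/(3*5*7*9), ...
print(sum_reciprocals(1, 4, 2, 50))
```

As $n$ grows, the second closed form tends to $\frac{1}{h(m-1)(a)_{m-1}}$, which is how an infinite inverse factorial series of this type is summed.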


When the sum was indefinite, Nicole wrote that the sum of the $f(x)$ was $g(x)$.⁴¹
For example, the sum of the terms $(x+2)(x+4)(x+6)$ was $\frac{x(x+2)(x+4)(x+6)}{8}$
because

$$\frac{(x+2)(x+4)(x+6)(x+8)}{8} - \frac{x(x+2)(x+4)(x+6)}{8} = (x+2)(x+4)(x+6).$$

In a similar manner, because

$$\frac{1}{x(x+2)(x+4)(x+6)} = \frac{1}{6(x+2)(x+4)}\left(\frac{1}{x} - \frac{1}{x+6}\right) = \frac{1}{6x(x+2)(x+4)} - \frac{1}{6(x+2)(x+4)(x+6)}, \tag{10.45}$$

he wrote that the sum of terms $\frac{1}{x(x+2)(x+4)(x+6)}$ would be $\frac{1}{6x(x+2)(x+4)}$.

As an application of the first type of sum, Nicole considered the series

$$1\cdot4\cdot7\cdot10 + 4\cdot7\cdot10\cdot13 + 7\cdot10\cdot13\cdot16 + 10\cdot13\cdot16\cdot19 + \text{etc.}$$

He gave the general term as $(x+3)(x+6)(x+9)(x+12)$ and its integral (sum) as
$\frac{x(x+3)(x+6)(x+9)(x+12)}{15}$. Now to find the constant of integration, note that the starting
value was $x = -2$ and the corresponding value of the integral was $-\frac{2\cdot1\cdot4\cdot7\cdot10}{15}$.
Thus, Nicole got the value of the series as

$$\frac{x(x+3)(x+6)(x+9)(x+12)}{15} + \frac{2\cdot1\cdot4\cdot7\cdot10}{15}.$$

41 Nicole (1717).


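Similarly, he computed the infinite series

$$\frac{1}{1\cdot3\cdot5\cdot7} + \frac{1}{3\cdot5\cdot7\cdot9} + \frac{1}{5\cdot7\cdot9\cdot11} + \frac{1}{7\cdot9\cdot11\cdot13} + \cdots + \frac{1}{x(x+2)(x+4)(x+6)} + \text{etc.} \tag{10.46}$$

by using (10.45) and observing that

$$\frac{1}{x(x+2)(x+4)(x+6)} + \frac{1}{(x+2)(x+4)(x+6)(x+8)} + \text{etc.} = \frac{1}{6x(x+2)(x+4)}.$$

Since the original sum started at $x = 1$, Nicole gave its value as $\frac{1}{6\cdot1\cdot3\cdot5} = \frac{1}{90}$. As
another example, Nicole then considered a slightly more difficult sum: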


4 sae 
ey weave Real mame Oh Pees gues pee 


+ etc. 


2 2 
Note that the general term was aa oss where x was replaced by x + 3 as 


one moved from one term to the next. Nicole wrote the numerator as

$$\frac{1}{36}\Big(A + Bx + Cx(x+3) + Dx(x+3)(x+6) + Ex(x+3)(x+6)(x+9)\Big)$$

and found the coefficients $A$ to $E$ by taking $x = 0, -3, -6,$ and $-9$. Nicole then
expressed the general term as

$$\frac{1}{36}\left(\frac{A}{x(x+3)\cdots(x+15)} + \cdots + \frac{E}{(x+12)(x+15)}\right)$$

and wrote the integral as

$$\frac{1}{36}\left(\frac{A}{15x(x+3)\cdots(x+12)} + \cdots + \frac{E}{3(x+12)}\right). \tag{10.47}$$
He found the sum by taking $x = 1$ in the sum, or integral, as he called it.
Series such as (10.46) are called inverse factorial series. Nicole treated these
examples in a 1717 paper on finite differences.⁴² He further developed inverse factorial
series in papers of 1723, 1724, and 1727. In his 1727 paper,⁴³ he gave a generalization
of (10.12):

$$\begin{aligned}
\frac{1}{b-a} &= \frac{1}{b} + \frac{a}{b(b-a)} = \frac{1}{b} + \frac{a}{b(b+c)} + \frac{a(a+c)}{b(b+c)(b-a)} \\
&= \frac{1}{b} + \frac{a}{b(b+c)} + \frac{a(a+c)}{b(b+c)(b+d)} + \frac{a(a+c)(a+d)}{b(b+c)(b+d)(b-a)}.
\end{aligned} \tag{10.48}$$

In modern terminology, with $c_0 = 0$, we can write (10.48) as

$$\frac{1}{b-a} = \frac{1}{b} + \sum_{k=1}^{n}\frac{a(a+c_1)\cdots(a+c_{k-1})}{b(b+c_1)(b+c_2)\cdots(b+c_k)} + \frac{a(a+c_1)\cdots(a+c_n)}{b(b+c_1)\cdots(b+c_n)(b-a)}. \tag{10.49}$$

42 Nicole (1717).


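In a letter to Goldbach⁴⁴ dated July 26, 1746, Euler wrote, without reference to
Nicole, that he had an infinite version of (10.49): Taking $c_i > 0$, $i = 1, 2, 3, \ldots$, and
with $\frac{1}{c_1} + \frac{1}{c_2} + \frac{1}{c_3} + \cdots$ divergent,

$$\frac{1}{b-a} = \frac{1}{b} + \frac{a}{b(b+c_1)} + \frac{a(a+c_1)}{b(b+c_1)(b+c_2)} + \cdots.$$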


As a special case, he took c1,c2,c3, ... to be primes starting with 1,2,3,5,...,b = 
4, and a = 3 to obtain 
1 3 3-4 3-4-6 


l=-4 bos 
4 Aes. Ass 7 4253959 


Euler gave no proof in his letter to Goldbach, but in his reply,⁴⁵ Goldbach
reproduced, without reference, the argument of Nicole, but without Euler’s condition
that $\sum \frac{1}{c_i}$ must be divergent. Although Euler’s letter did not refer to Nicole, he wrote
that (10.49) was a generalization of Stirling’s result

$$\frac{1}{b-a} = \frac{1}{b} + \frac{a}{b(b+1)} + \frac{a(a+1)}{b(b+1)(b+2)} + \frac{a(a+1)(a+2)}{b(b+1)(b+2)(b+3)} + \cdots,$$

given as Example 1 of Proposition 5⁴⁶ of Stirling’s Methodus Differentialis. It is not
clear why Euler specified that the series $\sum \frac{1}{c_n}$ must diverge, since clearly one could
take, for example, $c_n = n$.


43 Nicole (1727). 

44 Fuss (1968) vol. I, pp. 388-392, especially p. 392. 
45 ibid. pp. 394-396.

46 Stirling and Tweddle (2003) p. 52.

10.7 Stirling Numbers


In the introduction to his Methodus Differentialis, Stirling explained that the series
satisfying difference equations were best expressed by using terms of the factorial or
inverse factorial form $z(z-1)\cdots(z-m+1)$, or $\frac{1}{z(z+1)\cdots(z+m-1)}$, instead of $z^m$ or $\frac{1}{z^m}$.
He defined the Stirling numbers in order to facilitate the conversion of series expressed
in powers of $z$ into series with terms in factorial form. He gave two tables of numbers,
the second consisting of what we now call the signless Stirling numbers of the first
kind:⁴⁷


1 

1 1 

2 3 1 

6 11 6 1 

24 50 35 10 1 


120 274 225 85 15 1 

720 1764 1624 735 175 21 1 

5040 13068 13132 6769 1960 322 28 1
40320 109584 118124 67284 22449 4536 546 36 1 


Stirling described how to construct the table: “Multiply the terms of this progression
$n, 1+n, 2+n, 3+n$, etc. repeatedly by themselves, and let the results be arranged
in the following table in order of the powers of the number $n$, only the coefficients
having been retained.” Thus, to get the fourth row take $n(n+1)(n+2)(n+3) = 6n + 11n^2 + 6n^3 + n^4$ and the coefficients will be the numbers 6, 11, 6, 1 in the
fourth row.
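Stirling's recipe translates directly into repeated polynomial multiplication. The following is a small sketch of that construction (the function name is hypothetical); each row lists the coefficients of $n, n^2, n^3, \ldots$ in $n(n+1)\cdots(n+r-1)$.

```python
def signless_first_kind_rows(count):
    """Rows 1..count of the table of signless Stirling numbers of the first kind."""
    rows, poly = [], [0, 1]               # coefficients of n^0, n^1, ... ; start with n
    for r in range(1, count + 1):
        rows.append(poly[1:])             # drop the zero constant term
        shifted = [0] + poly              # n * poly
        scaled = [r * c for c in poly] + [0]   # r * poly
        poly = [s + t for s, t in zip(shifted, scaled)]   # poly * (n + r)
    return rows

for row in signless_first_kind_rows(9):
    print(*row)
```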

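Stirling applied these numbers to the expansion of $\frac{1}{z^{n+1}}$, $n = 1, 2, 3, \ldots$ as an inverse
factorial series. The numbers in the first column then appeared as numerators in the
expansion of $\frac{1}{z^2}$; the numbers in the second column appeared in that of $\frac{1}{z^3}$, and so on.
He wrote down three expansions explicitly:

$$\frac{1}{z^2} = \frac{1}{z(z+1)}\left(1 + \frac{1}{z+2} + \frac{2}{(z+2)(z+3)} + \frac{6}{(z+2)(z+3)(z+4)} + \text{\&c.}\right), \tag{10.50}$$

$$\frac{1}{z^3} = \frac{1}{z(z+1)(z+2)}\left(1 + \frac{3}{z+3} + \frac{11}{(z+3)(z+4)} + \text{\&c.}\right), \tag{10.51}$$

$$\frac{1}{z^4} = \frac{1}{z(z+1)(z+2)(z+3)}\left(1 + \frac{6}{z+4} + \frac{35}{(z+4)(z+5)} + \frac{225}{(z+4)(z+5)(z+6)} + \text{\&c.}\right). \tag{10.52}$$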


47 Stirling and Tweddle (2003) p. 29.

In his Methodus, Proposition 2, Example 6,⁴⁸ Stirling applied (10.50) to find an
approximate value of $\sum_{n=1}^{\infty}\frac{1}{n^2}$; he observed that

$$\sum_{k=0}^{\infty}\frac{1}{(z+k)^2} = \sum_{k=0}^{\infty}\frac{1}{(z+k)(z+k+1)} + \sum_{k=0}^{\infty}\frac{1}{(z+k)(z+k+1)(z+k+2)} + \sum_{k=0}^{\infty}\frac{2}{(z+k)(z+k+1)(z+k+2)(z+k+3)} + \cdots. \tag{10.53}$$

Since the sums on the right-hand side of (10.53) can be evaluated by the Montmort-Nicole
method mentioned earlier, the second of those sums would have the value

$$\sum_{k=0}^{\infty}\frac{1}{(z+k)(z+k+1)(z+k+2)} = \frac{1}{2}\sum_{k=0}^{\infty}\left(\frac{1}{(z+k)(z+k+1)} - \frac{1}{(z+k+1)(z+k+2)}\right) = \frac{1}{2z(z+1)}.$$

By treating the other infinite sums in a similar manner, Stirling could thus write that

$$\sum_{k=0}^{\infty}\frac{1}{(z+k)^2} = \frac{1}{z} + \frac{1}{2z(z+1)} + \frac{2}{3z(z+1)(z+2)} + \frac{6}{4z(z+1)(z+2)(z+3)} + \cdots = \sum_{k=0}^{\infty}\frac{k!}{(k+1)\,z(z+1)\cdots(z+k)}. \tag{10.54}$$

Stirling took $z = 13$ and then took thirteen terms of the right-hand side of (10.54)
to get the approximation

$$\sum_{k=13}^{\infty}\frac{1}{k^2} \approx 0.079957427. \tag{10.55}$$

He then added $\sum_{k=1}^{12}\frac{1}{k^2}$, a sum that approximately equals 1.564976638, to (10.55),
to arrive at 1.644934065 as the approximate value for the series $1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \text{etc.}$

Euler showed that this series was equivalent to $\frac{\pi^2}{6}$, implying that Stirling’s value was
correct to eight decimal places.
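Stirling's approximation is short enough to reproduce exactly. The sketch assumes the general term of (10.54) to be $k!/\big((k+1)\,z(z+1)\cdots(z+k)\big)$, as read above; the function name is hypothetical.

```python
from fractions import Fraction
import math

def inverse_factorial_tail(z, terms):
    """Partial sum of (10.54): sum over k of k! / ((k+1) z (z+1) ... (z+k))."""
    total = Fraction(0)
    core = Fraction(1, z)                 # k! / (z (z+1) ... (z+k)) for k = 0
    for k in range(terms):
        total += core / (k + 1)
        core *= Fraction(k + 1, z + k + 1)
    return total

head = sum(Fraction(1, k * k) for k in range(1, 13))   # 1 + 1/4 + ... + 1/144
tail = inverse_factorial_tail(13, 13)                  # z = 13, thirteen terms
print(float(head), float(tail), float(head + tail), math.pi ** 2 / 6)
```

The thirteen-term tail falls short of the true remainder by roughly $10^{-9}$, which accounts for the eight correct decimals.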


48 ibid. p. 46.

To better understand Stirling’s table, let us utilize Richard Stanley’s notation⁴⁹ to
denote by $c(m,k)$ the unsigned Stirling numbers defined by

$$z(z+1)(z+2)\cdots(z+m-1) = \sum_{k=1}^{m} c(m,k)\,z^k. \tag{10.56}$$

When $m = 4$, as Stirling observed, we get the numbers in the fourth row:

$$c(4,1) = 6, \quad c(4,2) = 11, \quad c(4,3) = 6, \quad c(4,4) = 1.$$

We set $c(m,0) = 0$ and $c(m,k) = 0$ for $k > m$. Thus, Stirling was basically stating
that

$$\frac{1}{z^{n+1}} = \sum_{m \ge n}\frac{c(m,n)}{z(z+1)\cdots(z+m)}. \tag{10.57}$$

Stirling did not write down a proof of (10.57), but it is obvious that he must have had
one. Although we cannot know for sure, Stirling’s argument may have gone something
like this: First expand $\frac{1}{z^{n+1}}$ as a factorial series with unknown coefficients $b(m,n)$:

$$\frac{1}{z^{n+1}} = \frac{b(n,n)}{z(z+1)\cdots(z+n)} + \frac{b(n+1,n)}{z(z+1)\cdots(z+n+1)} + \cdots + \frac{b(m,n)}{z\cdots(z+m)} + \cdots. \tag{10.58}$$

Then multiply (10.58) by the left-hand side of (10.56) to obtain

$$\frac{z(z+1)\cdots(z+m-1)}{z^{n+1}} = b(n,n)(z+m-1)\cdots(z+n+1) + \cdots + \frac{b(m,n)}{z+m} + \frac{b(m+1,n)}{(z+m)(z+m+1)} + \cdots. \tag{10.59}$$

Substituting $\sum_{k=1}^{m} c(m,k)z^k$ for the numerator of the left-hand side of (10.59)
produces

$$\frac{c(m,1)}{z^n} + \cdots + \frac{c(m,n)}{z} + \cdots + c(m,m)\,z^{m-n-1}.$$

When the right-hand side of (10.59) is expanded in powers of $z$ and $\frac{1}{z}$, then $\frac{1}{z}$ has
$b(m,n)$ as its only coefficient; hence

$$b(m,n) = c(m,n).$$


49 Stanley (2012) vol. I, pp. 26-27. 


We note that the Stirling numbers of the first kind, $s(m,n)$, are defined by

$$s(m,n) = (-1)^{m-n}c(m,n). \tag{10.60}$$


Stirling’s first table⁵⁰ was used to construct what we now call Stirling numbers of
the second kind:

1  1  1   1   1    1    1     1     1
   1  3   7  15   31   63   127   255
      1   6  25   90  301   966  3025
          1  10   65  350  1701  7770
              1   15  140  1050  6951
                   1   21   266  2646
                        1    28   462
                              1    36
                                    1

in which the first row represented the coefficients of the powers of $\frac{1}{z}$ in the expansion
of $\frac{1}{z-1}$. The second row gave the coefficients of the powers of $\frac{1}{z}$ in the expansion of

$$\frac{1}{(z-1)(z-2)} = \frac{1}{z^2} + \frac{3}{z^3} + \frac{7}{z^4} + \cdots.$$

Again using Stanley’s notation,⁵¹ the $m$th row was found by the expansion of

$$\frac{1}{(z-1)(z-2)\cdots(z-m)} = \sum_{n=m}^{\infty}\frac{S(n,m)}{z^n}. \tag{10.61}$$

Stirling then gave an expansion of $z^m$ in terms of a factorial expansion using
numbers of the second kind:

$$z^m = \sum_{k=1}^{m} S(m,k)\,z(z-1)\cdots(z-k+1) \tag{10.62}$$

by giving examples for $m = 1, 2, \ldots, 5$. For example, the fourth column gave him the
expansion for $z^4$:

$$z^4 = z + 7z(z-1) + 6z(z-1)(z-2) + z(z-1)(z-2)(z-3).$$


50 Stirling and Tweddle (2003) p. 26.
51 Stanley (2012) vol. I, p. 73.

Again, Stirling did not suggest a proof for formula (10.62), though it would
appear that he must have had one in mind. Observe that (10.61) implies that for
$n \ge m$,

$$\frac{z^n}{z(z-1)\cdots(z-m)} = S(m,m)\,z^{n-(m+1)} + S(m+1,m)\,z^{n-(m+2)} + \cdots + \frac{S(n,m)}{z} + \cdots. \tag{10.63}$$

Now we must prove (10.62); let us denote the coefficients on its right-hand side by
$b(m,k)$. Then we can write the left-hand side of (10.63) as

$$\frac{\sum_{k=1}^{n} b(n,k)\,z(z-1)\cdots(z-k+1)}{z(z-1)\cdots(z-m)} = \frac{b(n,1)}{(z-1)\cdots(z-m)} + \cdots + \frac{b(n,m)}{z-m} + \cdots + b(n,n)(z-(m+1))\cdots(z-n+1). \tag{10.64}$$

Comparing coefficients of $\frac{1}{z}$ on the right-hand sides of (10.63) and (10.64) produces
$b(n,m) = S(n,m)$, verifying (10.62).

There is now a well-known formula for Stirling numbers of the second kind, of
which Stirling may have been aware, namely:

$$S(m,n) = \frac{1}{n!}\sum_{k=0}^{n}(-1)^{n-k}\binom{n}{k}k^m. \tag{10.65}$$

To validate this formula, take $f(z) = z^m$ in (9.24), which is the Gregory-Newton
formula stated as Proposition 19 of the Methodus Differentialis, to obtain

$$z^m = \sum_{n=1}^{m}\frac{\Delta^n 0^m}{n!}\,z(z-1)\cdots(z-n+1), \tag{10.66}$$

because $\Delta^n 0^m = 0$ when $n > m$. Comparing (10.66) with (10.62), and using (9.26),
the proof is complete:

$$S(m,n) = \frac{\Delta^n 0^m}{n!} = \frac{1}{n!}\sum_{k=0}^{n}(-1)^{n-k}\binom{n}{k}k^m. \tag{10.67}$$


If we take the combinatorial interpretation of S(m,n), rather than Stirling’s 
definition given in (10.61), then we can say that (10.67) was derived by de Moivre 
in his “Mensura Sortis” of 1711/1712. I am indebted to James Pitman for pointing this 
out to me. 

It is instructive to understand in more detail Stirling’s construction of his tables.
In general terms, for the signless numbers of the first kind $c(n,k)$, he wrote that one
should first obtain the first $(n-1)$ rows and then find the $n$th row by multiplying the
polynomial for the $(n-1)$th row by $n - 1 + z$. We note that Stirling’s description
implies that

$$\begin{aligned}
&\big(c(n-1,1)z + c(n-1,2)z^2 + \cdots + c(n-1,n-1)z^{n-1}\big)(n-1+z) \\
&\quad = (n-1)c(n-1,1)z + \big((n-1)c(n-1,2) + c(n-1,1)\big)z^2 + \cdots \\
&\qquad + \big((n-1)c(n-1,k) + c(n-1,k-1)\big)z^k + \cdots + c(n-1,n-1)z^n \\
&\quad = c(n,1)z + c(n,2)z^2 + \cdots + c(n,k)z^k + \cdots + c(n,n)z^n.
\end{aligned}$$

Thus, if one takes $c(n,0) = 0$, and $c(n,m) = 0$ for $m > n$, then we arrive at the
relations

$$c(n,k) = c(n-1,k-1) + (n-1)c(n-1,k), \quad k = 1, 2, \ldots, n. \tag{10.68}$$

We thus perceive that Stirling must have known this relationship, though he did not
explicitly mention it. Moreover, a similar recurrence relation holds for $S(n,m)$ with
$n$ and $m$ positive integers and under the conditions $S(n,0) = 0$ and $S(n,m) = 0$ for
$m > n$:

$$S(n+1,m) = S(n,m-1) + mS(n,m). \tag{10.69}$$


We follow Tweddle’s argument to verify (10.69), given in his notes on Stirling’s
work:⁵² Observe from (10.63) that $S(n+1,m)$ is the coefficient of $\frac{1}{z}$ in the expansion of

$$\frac{z^{n+1}}{z(z-1)\cdots(z-m)}$$

in powers of $z$ and $\frac{1}{z}$. This is the same coefficient as that of $\frac{1}{z^2}$ in the expansion of

$$\frac{z^n}{z(z-1)\cdots(z-m)}.$$

We see from (10.64), with $b(n,m)$ replaced by $S(n,m)$, that two terms, expanded
in powers of $\frac{1}{z}$, will produce $\frac{1}{z^2}$:

$$\begin{aligned}
\frac{S(n,m-1)}{(z-m+1)(z-m)} + \frac{S(n,m)}{z-m}
&= \frac{S(n,m-1)}{z^2}\left(1 - \frac{m-1}{z}\right)^{-1}\left(1 - \frac{m}{z}\right)^{-1} + \frac{S(n,m)}{z}\left(1 - \frac{m}{z}\right)^{-1} \\
&= \frac{S(n,m-1)}{z^2}\left(1 + \frac{m-1}{z} + \cdots\right)\left(1 + \frac{m}{z} + \cdots\right) + \frac{S(n,m)}{z}\left(1 + \frac{m}{z} + \cdots\right).
\end{aligned}$$

52 Stirling and Tweddle (2003) p. 171.

The coefficient of $\frac{1}{z^2}$ is thus

$$S(n,m-1) + mS(n,m)$$

and (10.69) is proved. Tweddle noted⁵³ that “Apparently Stirling did not notice this
fact which ... allows us to construct the table much more easily.”
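The easier construction that Tweddle alludes to can be sketched as follows: build the numbers of the second kind from the recurrence (10.69), then compare them with the explicit formula (10.65) and with the factorial expansion (10.62) for $z^4$. The starting value $S(0,0) = 1$ is the usual convention, adopted here as an assumption, and the function names are hypothetical.

```python
from math import comb, factorial

def second_kind_table(size):
    """S[n][m] for 0 <= n, m <= size via S(n+1, m) = S(n, m-1) + m S(n, m)."""
    S = [[0] * (size + 1) for _ in range(size + 1)]
    S[0][0] = 1                                   # assumed starting value
    for n in range(size):
        for m in range(1, size + 1):
            S[n + 1][m] = S[n][m - 1] + m * S[n][m]
    return S

def second_kind_explicit(m, n):
    """(10.65): S(m, n) = (1/n!) * sum_k (-1)^(n-k) C(n, k) k^m, for m >= 1."""
    total = sum((-1) ** (n - k) * comb(n, k) * k ** m for k in range(n + 1))
    return total // factorial(n)

def falling(z, k):
    """z (z-1) ... (z-k+1)."""
    out = 1
    for i in range(k):
        out *= z - i
    return out

S = second_kind_table(9)
print([S[4][k] for k in range(1, 5)])             # coefficients in the expansion of z^4
print([S[n][3] for n in range(3, 10)])            # third row of Stirling's table
```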

It is possible to extend the definition of Stirling numbers to negative integers, but 
when this is done for Stirling numbers of the first kind, the result is Stirling numbers of 
the second kind and vice versa. Interestingly, we will thus see that there is but one kind 
of Stirling numbers. Stirling was perhaps aware of this fact. The numbers c(n,m) can 
be defined for positive as well as negative m and n by the initial conditions c(0,0) = 1, 
$c(n,0) = c(0,m) = 0$ when $n \neq 0$, $m \neq 0$ and the recurrence relation:


$$c(n,m) = c(n-1,m-1) + (n-1)\,c(n-1,m). \tag{10.70}$$


Note that (10.70) and (10.68) actually represent the same recurrence relation, but 
with different initial conditions; in (10.70) m and n can take any integer values. 

In the same manner, S(n,m) can be defined for all integers m and n by the initial 
conditions $S(0,0) = 1$, $S(n,0) = S(0,m) = 0$, when $n \neq 0$, $m \neq 0$, and the recurrence
relation


$$S(n,m) = S(n-1,m-1) + mS(n-1,m). \tag{10.71}$$


Again, observe that (10.71) and (10.69) are actually identical, except that their 
initial conditions are different; (10.71) is defined for all integers m and n. 

We are now in a position to prove that the two kinds of Stirling numbers can be 
reduced to only one kind: 


$$S(n,m) = c(-m,-n). \tag{10.72}$$


Observe that by (10.70), $c(-m,-n)$ satisfies the recurrence relation

$$c(-m,-n) = c(-m-1,-n-1) + (-m-1)\,c(-m-1,-n). \tag{10.73}$$

Let $a(n,m) = c(-m,-n)$ so that (10.73) can be written as

$$a(n+1,m+1) = a(n,m) + (m+1)\,a(n,m+1);$$

and by replacing $n+1$ by $n$ and $m+1$ by $m$ we get

$$a(n,m) = a(n-1,m-1) + m\,a(n-1,m). \tag{10.74}$$


Therefore, a(n,m) satisfies the same recurrence relation as S(n,m) and has 
the same initial conditions; we can conclude that a(n,m) = S(n,m) and (10.72) 
is proved. 


53 ibid. p. 171.

In his paper “Two notes on notation,”⁵⁴ Donald Knuth wrote, “... a rereading of
Stirling’s original treatment makes it clear that Stirling himself would not have found
the duality law [(10.72)] at all surprising. From the very beginning, he thought of the
numbers as two triangles hooked together in tandem.” Knuth’s paper presents an
interesting history of this duality law.

We also remark that Tweddle⁵⁵ has explained that Stirling, in an unpublished work,
revealed that he was aware of a relation between Bernoulli numbers and Stirling
numbers of the second kind. In his unpublished notes, Stirling referred to a formula of
de Moivre:⁵⁶

$$\sum_{k=0}^{\infty}\frac{1}{(z+k)^2} = \frac{1}{z} + \frac{1}{2z^2} + \sum_{m=1}^{\infty}\frac{B_{2m}}{z^{2m+1}}. \tag{10.75}$$


Now de Moivre presented (10.75) in his Supplement to his 1730 Miscellanea
Analytica. In the same Supplement, he discussed a method for calculating the
Bernoulli numbers that was essentially the same as Jakob Bernoulli’s own method.
We discuss this formula in Chapter 18. But note that if Stirling’s (10.61) is applied to
his (10.54), the result is

$$\sum_{k=0}^{\infty}\frac{1}{(z+k)^2} = \sum_{k=0}^{\infty}\frac{k!}{k+1}\sum_{n=k}^{\infty}\frac{(-1)^{n-k}S(n,k)}{z^{n+1}}. \tag{10.76}$$

Comparison of (10.75) and (10.76) reveals a relation between Bernoulli numbers
and Stirling numbers of the second kind:

$$B_n = \sum_{k=1}^{n}\frac{(-1)^k k!}{k+1}\,S(n,k), \quad n \ge 2. \tag{10.77}$$

Note that when $n$ is odd, $B_n = 0$, so that terms with even powers of $z$ in the
denominators in the sum on the right-hand side of (10.75) do not appear. Now Stirling
did not give formula (10.77), but rather remarked in connection with de Moivre’s
calculations that Bernoulli numbers could more easily be calculated by using Stirling
numbers of the second kind.
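The relation between Bernoulli numbers and Stirling numbers of the second kind is easy to try out. The sketch assumes (10.77) in the form $B_n = \sum_{k=1}^{n} \frac{(-1)^k k!}{k+1} S(n,k)$ and computes $S(n,k)$ from the recurrence (10.71) with $S(0,0) = 1$; the function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def second_kind(n, k):
    """S(n, k) from S(n, k) = S(n-1, k-1) + k S(n-1, k), with S(0, 0) = 1."""
    if n == 0 or k == 0:
        return 1 if n == k else 0
    return second_kind(n - 1, k - 1) + k * second_kind(n - 1, k)

def bernoulli_from_stirling(n):
    """B_n for n >= 2 by the assumed reading of (10.77)."""
    return sum(Fraction((-1) ** k * factorial(k), k + 1) * second_kind(n, k)
               for k in range(1, n + 1))

for n in range(2, 13):
    print(n, bernoulli_from_stirling(n))
```

The odd-indexed values come out as zero, in line with the remark that only odd powers of $z$ appear in the denominators of (10.75).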

The significance of the Stirling numbers was not fully realized until the twentieth
century when they became very useful in combinatorics. In his 1939 book
on the calculus of finite differences, the Hungarian mathematician Charles Jordan
(1871-1959) wrote,⁵⁷ “Since Stirling’s numbers are as important as Bernoulli’s,
or even more so, they should occupy a central position in the Calculus of Finite
Differences. The demonstration of a great number of formulae is considerably
shortened by using these numbers, and many new formulae are obtained by their aid;
for instance, those which express differences by derivatives or vice versa.”


54 Knuth (1992). 

55 Tweddle (1988) pp. 15-16. 
56 de Moivre (1730b) p. 19. 
57 Jordan (1979) pp. vii-viii.

10.8 Lagrange’s Proof of Wilson’s Theorem


Lagrange was the first mathematician to investigate the arithmetical properties of
Stirling numbers and he did so in the process of proving Wilson’s theorem. This
proposition, also found and published by al-Haytham in the tenth or eleventh
century,⁵⁸ named after Edward Waring’s best friend John Wilson (1741-1793), states
that for $n > 1$, $(n-1)! + 1$ is divisible by $n$ if and only if $n$ is prime. The statement of
Wilson’s theorem was also published in Waring’s Meditationes Algebraicae of 1770.⁵⁹
Waring was certain of the truth of the theorem but was unable to prove it. Lagrange
provided a proof,⁶⁰ using Stirling numbers, in 1771. For this purpose, he considered
the product

$$(x+1)(x+2)(x+3)(x+4)\cdots(x+n-1) = x^{n-1} + A'x^{n-2} + A''x^{n-3} + A'''x^{n-4} + \cdots + A^{(n-1)}.$$

We can see that the coefficients $A', A'', A''', \ldots, A^{(n-1)}$ are in fact absolute values of
Stirling numbers of the first kind, though Lagrange did not mention Stirling. Lagrange
replaced $x$ by $x + 1$ to get

$$(x+2)(x+3)(x+4)(x+5)\cdots(x+n) = (x+1)^{n-1} + A'(x+1)^{n-2} + A''(x+1)^{n-3} + \cdots + A^{(n-1)}.$$

It was then easy to see that

$$(x+n)\left(x^{n-1} + A'x^{n-2} + A''x^{n-3} + \cdots + A^{(n-1)}\right) = (x+1)^n + A'(x+1)^{n-1} + A''(x+1)^{n-2} + A'''(x+1)^{n-3} + \cdots + A^{(n-1)}(x+1).$$

Expanding both sides of this equation in powers of $x$, Lagrange obtained

$$\begin{aligned}
&x^n + (n + A')x^{n-1} + (nA' + A'')x^{n-2} + (nA'' + A''')x^{n-3} + \cdots \\
&\quad = x^n + (n + A')x^{n-1} + \left(\frac{n(n-1)}{2} + (n-1)A' + A''\right)x^{n-2} \\
&\qquad + \left(\frac{n(n-1)(n-2)}{2\cdot3} + \frac{(n-1)(n-2)}{2}A' + (n-2)A'' + A'''\right)x^{n-3} + \cdots.
\end{aligned}$$


58 Rashed (1980). 
59 Waring (1991). 
60 Lagrange (1771). 
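He next equated the coefficients on both sides to get recurrence relations for the
Stirling numbers of the first kind:

$$\begin{aligned}
n + A' &= n + A', \\
nA' + A'' &= \frac{n(n-1)}{2} + (n-1)A' + A'', \\
nA'' + A''' &= \frac{n(n-1)(n-2)}{2\cdot 3} + \frac{(n-1)(n-2)}{2}A' + (n-2)A'' + A''', \ \ldots,
\end{aligned}$$

or

$$A' = \frac{n(n-1)}{2}, \tag{10.78}$$

$$2A'' = \frac{n(n-1)(n-2)}{2\cdot 3} + \frac{(n-1)(n-2)}{2}A', \tag{10.79}$$

$$3A''' = \frac{n(n-1)(n-2)(n-3)}{2\cdot 3\cdot 4} + \frac{(n-1)(n-2)(n-3)}{2\cdot 3}A' + \frac{(n-2)(n-3)}{2}A'', \tag{10.80}$$

$$\vdots$$

$$(n-1)A^{(n-1)} = 1 + A' + A'' + \cdots + A^{(n-2)}. \tag{10.81}$$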


Lagrange noted that if $n$ were an odd prime $p$, then the equation (10.78) showed
that $A'$ was divisible by $p$; (10.79) showed $A''$ divisible by $p$, and so on. The
equation that would immediately precede (10.81) implied that $p$ divided $A^{(p-2)}$. Next,
observing that $A^{(n-1)} = (n-1)! = (p-1)!$, Lagrange perceived that equation (10.81)
implied Wilson’s theorem that $A^{(p-1)} + 1 = (p-1)! + 1$ was divisible by $p$: with $n = p$,
the right side of (10.81) is congruent to 1 modulo $p$, while the left side is congruent
to $-A^{(p-1)}$. As an application of this theorem, Lagrange determined the quadratic
character of $-1$ modulo a prime $p$. That is, he deduced that if $p$ were a prime of the
form $4n + 1$, then there had to exist an integer $x$ such that $x^2 + 1$ was divisible by $p$.
Note that Euler had given a remarkable proof of this result using repeated differences
of the sequence $1^n, 2^n, 3^n, 4^n, \ldots$.⁶¹ See Exercise 13 at the end of this chapter.
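
Lagrange’s argument can be checked numerically. The following is a minimal Python sketch, not Lagrange’s own procedure: it multiplies out $(x+1)(x+2)\cdots(x+n-1)$ directly and then tests the divisibility claims and relation (10.81). The function names `lagrange_coefficients` and `wilson_check` are hypothetical, chosen only for this illustration.

```python
def lagrange_coefficients(n):
    """Return [1, A', A'', ..., A^(n-1)], the coefficients of
    (x+1)(x+2)...(x+n-1) from the power x^(n-1) down to the constant term."""
    coeffs = [1]  # the empty product
    for k in range(1, n):
        # Multiply the current polynomial by (x + k).
        times_x = coeffs + [0]                    # shift every term up one power
        times_k = [0] + [k * c for c in coeffs]   # same powers, scaled by k
        coeffs = [s + t for s, t in zip(times_x, times_k)]
    return coeffs


def wilson_check(n):
    """True when A^(n-1) + 1 = (n-1)! + 1 is divisible by n."""
    return (lagrange_coefficients(n)[-1] + 1) % n == 0


if __name__ == "__main__":
    p = 7
    coeffs = lagrange_coefficients(p)
    print(coeffs)
    # A', ..., A^(p-2) are all multiples of the odd prime p.
    print(all(a % p == 0 for a in coeffs[1:-1]))
    # Relation (10.81): (n-1) A^(n-1) = 1 + A' + ... + A^(n-2).
    print((p - 1) * coeffs[-1] == sum(coeffs[:-1]))
    print(wilson_check(p), wilson_check(9))
```

Working with exact integers keeps the check faithful to the divisibility statements; for large $n$ one would reduce modulo $n$ at each step instead.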


10.9 Taylor’s Summation by Parts 


The method of summation by parts is usually attributed to Abel who used it in a 
rigorous discussion of convergence of series. However, a century earlier, in the 1717 
Philosophical Transactions, Taylor explicitly worked out this idea as an analog of 
integration by parts. Moreover, one can see in the work of Newton, Leibniz, and others 
that they were implicitly aware of this method. Abel’s summation by parts method 
consists in moving from (4.25) to (4.26) and this may be compared with the steps 
Newton took in (10.17). 


61 Fuss (1968) vol. I, pp. 493-497. See also Eu. I-2, section 5, pp. 328-337. E 242.

Taylor’s result is actually an indefinite summation formula in which the constants
of summation are not explicitly written. Because Taylor’s presentation is obscure,
we present below Lacroix’s 1819 derivation, in which $\sum$ and $\Delta$ were taken
as inverse operations. In Lacroix’s notation, $P$ and $Q$ were functions of an integer
variable $x$ and $P_1 = P + \Delta P$, $P_2 = P_1 + \Delta P_1$, etc. In this notation, Taylor’s formula
took the form

$$\sum PQ = Q\sum P - \sum\left(\Delta Q \sum P_1\right). \tag{10.82}$$


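To prove this following Lacroix,⁶² first suppose that

$$\sum PQ = Q\sum P + z.$$

Apply the difference operator $\Delta$ to both sides to get

$$\Delta\left(\sum PQ\right) = \Delta\left(Q\sum P + z\right),$$

or

$$\begin{aligned}
PQ &= (Q + \Delta Q)\sum(P + \Delta P) - Q\sum P + \Delta z \\
&= Q\sum \Delta P + \Delta Q\sum(P + \Delta P) + \Delta z \\
&= QP + \Delta Q\sum(P + \Delta P) + \Delta z.
\end{aligned}$$

Hence,

$$\Delta z = -\Delta Q\sum(P + \Delta P) \quad\text{or}\quad z = -\sum\left(\Delta Q\sum(P + \Delta P)\right) = -\sum\left(\Delta Q\sum P_1\right),$$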


and Lacroix’s proof of Taylor’s formula (10.82) was complete. In his book, Lacroix 
attributed the result to Taylor. 
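
Formula (10.82) can be tested on concrete sequences. The sketch below is only an illustration under stated assumptions: we fix the constants of summation by taking $\sum F$ at $x$ to mean $F(0) + \cdots + F(x-1)$, and we read $\sum P_1$ as $\sum P$ shifted by one step, as in Lacroix’s derivation. The helper names are hypothetical.

```python
def indefinite_sum(f, x):
    """(Sigma f)(x) = f(0) + ... + f(x-1), so that Delta(Sigma f) = f."""
    return sum(f(k) for k in range(x))


def summation_by_parts_sides(P, Q, x):
    """Return both sides of (10.82) evaluated at the integer x >= 0."""
    lhs = indefinite_sum(lambda k: P(k) * Q(k), x)

    def sigma_P(k):
        return indefinite_sum(P, k)

    def sigma_P1(k):
        # Sigma P_1 is Sigma P advanced by one step (assumed convention).
        return sigma_P(k + 1)

    def delta_Q(k):
        return Q(k + 1) - Q(k)

    rhs = Q(x) * sigma_P(x) - indefinite_sum(lambda k: delta_Q(k) * sigma_P1(k), x)
    return lhs, rhs


if __name__ == "__main__":
    # Example: P(k) = 2^k, Q(k) = k, i.e. the sum of k * 2^k.
    for x in range(6):
        print(x, summation_by_parts_sides(lambda k: 2 ** k, lambda k: k, x))
```

With this choice of constants the two sides agree exactly; a different choice would change them by terms that depend on the constants.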

Just as one can perform repeated integration by parts, one may also do repeated 
summation by parts if necessary. Thus, Lacroix gave this formula for repeated 
summation by parts: 


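$$\sum PQ = Q\sum P - \Delta Q\sum{}^{2} P_1 + \Delta^2 Q\sum{}^{3} P_2 - \Delta^3 Q\sum{}^{4} P_3 + \text{etc.}, \tag{10.83}$$

where $\sum^2, \sum^3, \ldots$ denoted double, triple, ... summation. To derive this formula, he
replaced $Q$ by $\Delta Q$ and $P$ by $\sum P_1$ in (10.82). He then had

$$\sum\left(\Delta Q\sum P_1\right) = \Delta Q\sum{}^{2} P_1 - \sum\left(\Delta^2 Q\sum{}^{2} P_2\right).$$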


62 Lacroix (1819) pp. 91-92 or Lacroix (1800) pp. 89-90.

Combining this with (10.82), he obtained

$$\sum PQ = Q\sum P - \Delta Q\sum{}^{2} P_1 + \sum\left(\Delta^2 Q\sum{}^{2} P_2\right).$$


A continuation of this process would yield formula (10.83). 
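
When $Q$ is a polynomial, its differences eventually vanish and the series (10.83) terminates, so it can be verified exactly. The following Python sketch assumes, as our own convention rather than Lacroix’s, that every constant of summation is zero, so that $\sum^{m+1} P_m$ at $x$ is the $(m+1)$-fold sum of $P$ read at $x + m$. All names are hypothetical.

```python
from math import comb


def iterated_sum_tables(P, depth, size):
    """tables[j][x] is the j-fold indefinite sum of P at x, for j = 1..depth
    and x = 0..size-1, with all constants of summation taken as zero."""
    tables = [None]  # index 0 is unused
    current = [P(k) for k in range(size)]
    for _ in range(depth):
        summed = [0] * size
        for x in range(1, size):
            summed[x] = summed[x - 1] + current[x - 1]
        tables.append(summed)
        current = summed
    return tables


def difference(Q, x, m):
    """The m-th forward difference of Q at x."""
    return sum((-1) ** (m - i) * comb(m, i) * Q(x + i) for i in range(m + 1))


def repeated_summation_by_parts(P, Q, x, terms):
    """Right side of (10.83), truncated after the given number of terms."""
    tables = iterated_sum_tables(P, terms, x + terms + 1)
    return sum((-1) ** m * difference(Q, x, m) * tables[m + 1][x + m]
               for m in range(terms))


if __name__ == "__main__":
    P = lambda k: 3 ** k
    Q = lambda k: k ** 3 + 2 * k      # fourth differences vanish
    for x in range(6):
        direct = sum(P(k) * Q(k) for k in range(x))
        print(x, direct, repeated_summation_by_parts(P, Q, x, 4))
```

Four terms suffice here because $Q$ has degree three; for a non-polynomial $Q$ the truncated sum differs from the direct sum by the remaining summation term.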


10.10 Exercises 


(1) Show that for a finite sequence of positive decreasing numbers $a_0, a_1, \ldots, a_n$,

$$a_0 = \sum_{i=0}^{n-1} (a_i - a_{i+1}) + a_n.$$

The sequence can be infinite; in that case “the last number” of the sequence,
$a_n$, should be replaced by the limit of the sequence. Then deduce the sum of
the convergent infinite geometric series. For a reference to this 1644 result of
Torricelli, see Weil (1989b).

(2) Use the inequality 


to prove that 


$$\sum_{n=1}^{\infty} \frac{1}{n}$$


diverges. Apply Torricelli’s formula in the previous problem to sum 


See Weil (1989b) for the reference to these 1650 results of Mengoli. 
(3) Show that 


Leibniz also mentioned this result in his Historia et Origo of 1714, written 
in connection with the calculus controversy, where he explained that in 
a 1682 article in the Acta Eruditorum, he had extended the inverse rela- 
tionship between differences and sums to differentials and integrals. See 
Leibniz (1971) vol. 5, p. 122. 

(4) Find the sums of the reciprocals of the figurate numbers. For example, the 
sum of the reciprocals of the pyramidal numbers is given as 
$$\frac{1}{1} + \frac{1}{4} + \frac{1}{10} + \frac{1}{20} + \frac{1}{35} + \cdots = \frac{3}{2}.$$


See Jakob Bernoulli (1993-99) vol. 4, p. 66. 
(5) Show that if 


1 1 1 1 1 1 
x= and y = : 
then 
x 
A= y => 3m * 


See Jakob Bernoulli (1993-99) vol. 4, p. 74. 
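(6) Let $m$, $n$, and $p$ be integers and $x = a + mn$. Show that

$$\sum_{k=0}^{m} (a + kn)(a + (k+1)n)\cdots(a + (k+p-1)n) = \frac{x(x+n)\cdots(x+pn) - (a-n)a(a+n)\cdots(a+(p-1)n)}{(p+1)n}.$$

Deduce the values of the sums

$$1 + 2 + 3 + 4 + \cdots + x,$$

$$1 + 3 + 6 + 10 + \cdots \text{ to } x \text{ terms},$$

$$1\cdot 3\cdot 5 + 3\cdot 5\cdot 7 + 5\cdot 7\cdot 9 + \cdots \text{ to } x \text{ terms}.$$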


This is the first result in Montmort’s (1717) paper De Seriebus Infinitis 
Tractatus. He noted that this result was a generalization of a sum in Taylor’s 
Methodus Incrementorum. 


(7) Sum the series 


5 41 


3-5-7-9-11-13 5-7-9-11-13-15 
131 i, RT |). Ae 


7-9-11-13-15-17  9---19 11---21 


Montmort (1717) computed the sum to be ao 
(8) Prove Taylor’s summation formula (10.12) by an application of the Harriot— 
Briggs, usually known as the Gregory—Newton, formula. 
(9) Find the values of A, B, ..., E to obtain the sum of Nicole’s series (10.47). 
(10) Derive Gregory’s formula (10.1) or (10.2) from the Newton—Gauss 
interpolation. 
(11) Use Wilson’s theorem to prove that if $p = 4n + 1$ is a prime, there exists
an integer $x$ such that $p$ divides $x^2 + 1$. See Lagrange (1867-1892) vol. 3,
pp. 425-438.

(12) Prove Wilson’s theorem by using Stirling numbers of the first kind and
Fermat’s little theorem that $a^{p-1} \equiv 1 \pmod{p}$ when $p$ is prime and $a$ is
an integer not divisible by $p$:
(a) Let

$$(x-1)(x-2)\cdots(x-p+1) = x^{p-1} + A_1x^{p-2} + A_2x^{p-3} + \cdots + A_{p-1},$$

and $A_0 = 1 + A_{p-1}$. Observe that $x^{p-1} + A_{p-1} \equiv A_0 \pmod{p}$ for
$x = 1, 2, \ldots, p-1$. Prove that

$$A_0 + k^{p-2}A_1 + k^{p-3}A_2 + \cdots + kA_{p-2} \equiv 0 \pmod{p}$$

for $k = 1, 2, \ldots, p-1$.

(b) Show that the system of $p - 1$ equations in
$A_0, A_1, \ldots, A_{p-2}$ has a nonzero determinant modulo $p$.

(c) Deduce that $A_0 \equiv 0, A_1 \equiv 0, \ldots, A_{p-2} \equiv 0 \pmod{p}$. Sylvester
published this result in 1854. See Sylvester (1973) vol. 2, p. 10.


(13) Show that, given a prime p = 4n + 1, the 2nth difference of the sequence 


is not divisible by $p$. Deduce that $a^{2n} - 1$ is not divisible by $p$ for all $1 < a < p - 1$.
For this result of Euler, see his correspondence in Fuss (1968) p. 494
and Eu. I-2 section 5, pp. 328-337 (E 242). See also Weil (1984) p. 65, and 
Edwards (1977) p. 47. 


10.11 Notes on the Literature 


A brief summary of David Gregory’s Exercitatio by Whiteside can be found in 
Newton (1967-1981) vol. IV, pp. 414-417. See Tweedie (1917-1918) for an eval- 
uation of Nicole’s researches in the calculus of finite differences. Whittaker may 
have been the first to notice the connection of proposition 7 of Stirling’s Metho- 
dus Differentialis with the transformation of hypergeometric series. See p. 286 of 
Whittaker and Watson (1927). For an English translation by J. D. Blanton of the 
first part of Euler’s 1755 book on differential calculus, see Euler (2000). An English 
translation of Fuss (1968) vol. 1, may be found in Correspondence of Leonhard Euler 
with Christian Goldbach, Part 2, published by Springer in 2015. Part 1 contains the 
untranslated letters. 
The Taylor Series


11.1 Preliminary Remarks 


In 1715, Brook Taylor (1685-1731) published one of the most basic results in the 
theory of infinite series, now known as Taylor’s formula. Taylor published his formula 
fifty years after Newton’s seminal work on series and twenty-five years after Newton 
discovered, but did not publish, this same formula. In modern notation, Taylor’s 
formula takes the form 


$$f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots. \tag{11.1}$$

Taylor communicated this result to John Machin in a letter of July 26, 1712.¹ We
note here that in 1706, Machin calculated $\pi$ to 100 digits by employing the formula

$$\frac{\pi}{4} = 4\arctan\frac{1}{5} - \arctan\frac{1}{239}.$$
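
To see how quickly Machin’s formula converges, one can evaluate it with exact integer arithmetic. The sketch below is a modern illustration, not a reconstruction of Machin’s hand computation: it sums the series $\arctan\frac{1}{n} = \frac{1}{n} - \frac{1}{3n^3} + \frac{1}{5n^5} - \cdots$ on integers scaled by a power of ten, with ten guard digits that are an arbitrary choice of ours. The names `arctan_inverse` and `machin_pi` are hypothetical.

```python
GUARD_DIGITS = 10  # extra digits carried to absorb truncation errors (our choice)


def arctan_inverse(n, digits):
    """arctan(1/n) multiplied by 10**(digits + GUARD_DIGITS), as an integer."""
    scale = 10 ** (digits + GUARD_DIGITS)
    power = scale // n          # scaled value of 1/n
    total = power
    n_squared = n * n
    k = 1
    sign = -1
    while power:
        power //= n_squared     # next odd power of 1/n
        k += 2
        total += sign * (power // k)
        sign = -sign
    return total


def machin_pi(digits):
    """Integer whose decimal digits are those of pi, truncated to the given
    number of decimal places (the leading digit is 3)."""
    scaled = 4 * (4 * arctan_inverse(5, digits) - arctan_inverse(239, digits))
    return scaled // 10 ** GUARD_DIGITS


if __name__ == "__main__":
    print(machin_pi(50))
    print(machin_pi(100))   # the precision Machin reached in 1706
```

Each term of the $\frac{1}{5}$ series gains about 1.4 decimal digits, so roughly eighty terms are needed for 100 digits, while the $\frac{1}{239}$ series needs far fewer.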


In his letter to Machin, Taylor wrote 


I fell into a general method of applying Dr. Halley’s Extraction of roots to all Problems, wherein 
the Abscissa is required, the Area being given which, for the service that it may be of calculations, 
(the only true use of all corrections) I cannot conceal. And it is comprehended in this Theorem. 
... If $\alpha$ be any compound of the powers of $z$ and given quantities whether by a finite or infinite
expression rational or surd. And $\beta$ be the like compound of $p$ and the same coefficients, and
$z = p + x$, and $\dot{p} = 1 = \dot{z}$. Then will


1 Bateman (1907).
Where $\dot{\alpha}$, $\ddot{\alpha}$, &c. are formed in the same manner of $z$ and the given quantities, as $\dot{\beta}$, $\ddot{\beta}$, &c.
are formed of $p$. &c. So that having given $\alpha$, $\beta$, and one of the abscissae $z$ or $p$, the other may
be found by extracting $x$, their difference, out of this aequation. Or you may apply this to the
invention of $\alpha$ or $\beta$, having given $z$, $p$ and $x$. But you will easily see the uses of this.


Newton discovered and extensively used infinite series in the period 1664-1670, 
but during that time it does not appear that he observed the connection between the 
derivatives of a function and the coefficients of its series expansion. This connection is 
the essence of the Taylor series, and it can be applied to obtain power series expansions 
of many functions. Newton discovered infinite series before he had investigated the 
concept of derivatives. Thus, he found expansions for several functions by using 
his binomial expansion, term-by-term integration, and reversion of series. He first 
indicated his awareness of the connection between derivatives and coefficients in 
his 1687 Principia, wherein he expanded $(e^2 - 2ao - o^2)^{1/2}$ in powers of $o$ and
interpreted the coefficients as geometric quantities directly related to the derivatives
of the function.² It is very possible that Newton was aware of the Taylor expansion
at this time. In fact, thirty years later, this Principia result inspired James Stirling to
consider whether it could be generalized, leading him to an independent discovery of
the Maclaurin series, published in 1717 in the Philosophical Transactions.³

In 1691-92, Newton gave an explicit statement of Taylor’s formula as well as 
the particular case now named for Maclaurin. These appear in his De Quadratura 
Curvarum, composed in the winter of 1691-92 and never fully completed; in 1704 
parts of this text were published under the title Tractatus de Quadratura Curvarum. 
Unfortunately, the published portions omitted the Taylor and Maclaurin theorems. 
As we shall see, Gregory used this result in 1670 to construct series for numerous 
functions, but Newton was the first to give its clear, though unpublished, statement. 
In his De Quadratura, finally published in 1976 by Whiteside, Newton discussed 
the problem of solving algebraic differential equations by means of infinite series. 
In this context, he stated the Taylor and Maclaurin expansions and then wrote the 
word “Example” and left a blank space. Apparently, he intended to give a solution 
of an algebraic differential equation but could not hit upon a satisfactory example. 
According to Whiteside, Newton’s worksheets from this period show that he made 
several attempts to solve the equation ,/1 + y2 x }) = nj without complete success.* 
This may explain why he did not include these results in the published work, although 
his corollaries three and four contain Newton’s own formulation of the Taylor and 
Maclaurin series:⁵


Corollary 3. Hence, indeed, if the series proves to be of this form

$$y = az + bz^2 + cz^3 + dz^4 + ez^5 + \text{etc.}$$

(where any of the terms a, b, c, d etc. can either be lacking or be negative), the fluxions of y,
when z vanishes, are had by setting $\dot{y}/\dot{z} = a$, $\ddot{y}/\dot{z}^2 = 2b$, $\dddot{y}/\dot{z}^3 = 6c$, $\ddddot{y}/\dot{z}^4 = 24d$, $\overset{\cdot\cdot\cdot\cdot\cdot}{y}/\dot{z}^5 = 120e$.


2 See Proposition X of the second book of the Principia. 
3 Stirling (1717). 

4 Newton (1967-1981) vol. 7, p. 99, footnote 111. 

5 ibid. pp. 97-99. 
Corollary 4. And hence if in the equation to be resolved there be written w + x for z, as in Case 3,
and by resolving the equation there should result the series $[y =]\, ex + fx^2 + gx^3 + hx^4 +$ etc.
the fluxions of y for any assumed magnitude of z whatever will be obtained in finite equations by
setting x = 0 and so z = w. For the equations of this sort gathered by the previous Corollary will
be $\dot{y}/\dot{z} = e$, $\ddot{y}/\dot{z}^2 = 2f$, $\dddot{y}/\dot{z}^3 = 6g$, $\ddddot{y}/\dot{z}^4 = 24h$ etc.


Even before Newton, the Scottish mathematician James Gregory discovered and 
used Maclaurin’s series to obtain power series expansions of some fairly complicated 
functions. In a letter to John Collins dated February 15, 1671,⁶ Gregory gave the
series expansions for $\arctan x$, $\tan x$, $\sec x$, $\ln\sec x$, $\ln\tan\left(\frac{\pi}{4} + \frac{x}{2}\right)$, $\operatorname{arcsec}\left(\sqrt{2}\,e^{x}\right)$, and
$2\arctan\left(\tanh\frac{x}{2}\right)$. Of the seven series, the first two are inverses of each other, as
are the fifth and the seventh; the fourth and sixth are inverses of each other, except
for a constant factor. It does not seem that Gregory applied reversion of series, a
technique used by Newton, to obtain the inverses. On the back of a letter from Shaw,⁷
Gregory noted the first few derivatives of some of the seven functions. From this,
we can see that Gregory derived the series for the second, third, sixth, and seventh
functions by taking the derivatives; the series for the fourth and fifth using term-by-term
integration of the series for the second and third functions; and the series for
$\arctan x$ by integration of the series for $\frac{1}{1+x^2}$. As we shall see below, a key mistake in
Gregory’s calculations gives us evidence that he used the derivatives of a function to 
find its series. 
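
The method of finding a series from successive derivatives can be illustrated on the second of Gregory’s functions. The following Python sketch is our own device, not Gregory’s recorded working: it writes each derivative of $\tan x$ as a polynomial in $t = \tan x$, using $\frac{d}{dx}t^k = kt^{k-1}(1 + t^2)$, and reads off the value at $x = 0$, where $t = 0$. The function name is hypothetical.

```python
from fractions import Fraction
from math import factorial


def tan_maclaurin(num_terms):
    """Coefficients of 1, x, x^2, ... in the Maclaurin series of tan x,
    obtained from the successive derivatives at x = 0."""
    poly = [0, 1]   # tan x itself is the polynomial t; poly[k] multiplies t^k
    coefficients = []
    for m in range(num_terms):
        # At x = 0 we have t = 0, so the m-th derivative there is poly[0].
        coefficients.append(Fraction(poly[0], factorial(m)))
        # Differentiate with respect to x: d/dx t^k = k t^(k-1) + k t^(k+1).
        derivative = [0] * (len(poly) + 1)
        for k, c in enumerate(poly):
            if k:
                derivative[k - 1] += k * c
                derivative[k + 1] += k * c
        poly = derivative
    return coefficients


if __name__ == "__main__":
    for power, coefficient in enumerate(tan_maclaurin(8)):
        if coefficient:
            print(f"x^{power}: {coefficient}")
```

The derivatives grow quickly in size, which suggests how easily an arithmetical slip could enter a hand calculation of this kind.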

Gregory knew that Newton had made remarkable advances in the theory of series, 
though he had seen only one example of Newton’s work. He concluded that Newton 
must have known Taylor’s expansion, since that could be used to find the power 
series expansion of any known function. Before giving his seven series expansions 
in his letter to Collins, Gregory wrote, “As for Mr. Newton’s universal method, 
I imagine I have some knowledge of it, both as to geometrik and mechanick curves, 
however I thank you for the series [of Newton] ye sent me, and send you these 
following in requital.”⁸

In 1694, Johann Bernoulli published a result in the Acta Eruditorum equivalent to
Taylor’s formula,⁹ though it did not as easily produce the series expansion:

$$\int n\,dz = nz - \frac{z^2}{1\cdot 2}\frac{dn}{dz} + \frac{z^3}{1\cdot 2\cdot 3}\frac{ddn}{dz^2} - \frac{z^4}{1\cdot 2\cdot 3\cdot 4}\frac{d^3n}{dz^3} + \text{etc.} \tag{11.2}$$


He used this to solve first-order differential equations by infinite series and also 
applied it to find the series for sin x and In(a + x) and to solve de Beaune’s problem 
concerning the curve whose subtangent remained a constant. Unfortunately, Bernoulli 
could not use this formula to derive the series for sin x; he obtained only a ratio of two 
series for ——— where y = asinx. He commented that, though this method had 


this drawback, it was commendable for its universality. Bernoulli communicated his 


6 Turnbull (1939) pp. 168-176. 

7 Turnbull (1939) pp. 350-353. Thanks to the gracious help of the librarians at St. Andrews University, the 
author was able to view and obtain a copy of the original letter and Gregory’s notes. 

8 ibid. p. 170. 

9 Bernoulli, Johann (1968) vol. 1, pp. 125-128. 
series to Leibniz in a letter of September 2, 1694,¹⁰ before the paper was printed. In
reply, Leibniz observed that he had obtained similar results almost two decades earlier
by using the method of differences of varying orders. He gave a detailed exposition
of how that method would produce Bernoulli’s series. We note that in 1704, Abraham
de Moivre published an alternative proof of Bernoulli’s series and four years later he
communicated this to Bernoulli.¹¹
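
When $n$ is a polynomial in $z$, Bernoulli’s series (11.2) terminates and can be compared exactly with the integral taken from $0$ to $z$. The sketch below assumes that lower limit and represents a polynomial by its list of coefficients in ascending powers; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def poly_eval(coeffs, z):
    """Value at z of the polynomial with the given ascending coefficients."""
    return sum(a * z ** k for k, a in enumerate(coeffs))


def poly_derivative(coeffs):
    """Coefficients of the derivative (an empty list once nothing is left)."""
    return [k * a for k, a in enumerate(coeffs)][1:]


def integral_from_zero(coeffs, z):
    """Exact integral of the polynomial from 0 to z."""
    return sum(Fraction(a, k + 1) * z ** (k + 1) for k, a in enumerate(coeffs))


def bernoulli_series(coeffs, z):
    """nz - z^2/2! n' + z^3/3! n'' - ..., with n and its derivatives taken at z."""
    total = Fraction(0)
    k = 0
    while coeffs:
        total += (-1) ** k * Fraction(z) ** (k + 1) / factorial(k + 1) * poly_eval(coeffs, z)
        coeffs = poly_derivative(coeffs)
        k += 1
    return total


if __name__ == "__main__":
    n = [1, -3, 0, 2]            # n(z) = 1 - 3z + 2z^3
    z = Fraction(3, 2)
    print(integral_from_zero(n, z), bernoulli_series(n, z))
```

The agreement reflects the fact that (11.2) is the Taylor expansion of the integral about the upper limit, evaluated back at the lower limit.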
It seems very likely that Gregory derived the Taylor expansion from the Gregory–Newton,
or rather, the Harriot–Briggs, interpolation formula:

$$f(x) = f(a) + \frac{x-a}{h}\,\Delta f(a) + \frac{(x-a)(x-a-h)}{2!\,h^2}\,\Delta^2 f(a) + \cdots,$$

where

$$\Delta f(a) = f(a+h) - f(a), \quad \Delta^2 f(a) = f(a+2h) - 2f(a+h) + f(a), \ \ldots.$$

As $h \to 0$, the number of interpolating points tends to infinity, and

$$\frac{\Delta f(a)}{h} \to f'(a), \quad \frac{\Delta^2 f(a)}{h^2} \to f''(a), \ \ldots.$$

The resulting series is Taylor’s expansion. This proof is not rigorous, but the same 
argument was given by de Moivre in his letter to Bernoulli and then again by Taylor 
in his Methodus Incrementorum Directa et Inversa of 1715.¹² Leibniz too started
with a formula involving finite differences to derive Bernoulli’s series. On the other 
hand, the unpublished argument of Newton, also independently found by Stirling and 
Maclaurin, assumed that the function had a series expansion and then, by repeated 
differentiation of the equation, showed that the coefficients of the series were the 
derivatives of the function computed at specific values. This is called the method of 
undetermined coefficients. 
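
The passage from the interpolation formula to Taylor’s expansion can be watched numerically. The following Python sketch uses exact rational arithmetic on the test function $f(x) = x^3$, which is our choice for illustration; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def forward_difference(f, a, h, k):
    """The k-th forward difference of f at a with step h."""
    return sum((-1) ** (k - i) * comb(k, i) * f(a + i * h) for i in range(k + 1))


def difference_quotient(f, a, h, k):
    """The k-th difference divided by h^k; it tends to the k-th derivative."""
    return forward_difference(f, a, h, k) / h ** k


def gregory_newton(f, a, h, x, terms):
    """Partial sum of the Gregory-Newton interpolation formula at x."""
    total = 0
    factor = 1   # (x-a)(x-a-h)...(x-a-(k-1)h) / (k! h^k)
    for k in range(terms):
        total += factor * forward_difference(f, a, h, k)
        factor = factor * (x - a - k * h) / ((k + 1) * h)
    return total


def cube(x):
    return x ** 3


if __name__ == "__main__":
    a = Fraction(2)
    for h in (Fraction(1, 10), Fraction(1, 1000)):
        print(h, [difference_quotient(cube, a, h, k) for k in (1, 2, 3)])
    print(gregory_newton(cube, a, Fraction(1, 10), Fraction(5, 2), 4))
```

For the cube the quotients are $3a^2 + 3ah + h^2$, $6a + 6h$, and $6$, so they approach $f'(a)$, $f''(a)$, $f'''(a)$ as $h$ shrinks, while four terms of the interpolation formula already reproduce $f(x)$ exactly.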

We can see that there were three different methods by which the early researchers 
on the Taylor series discovered the expansion: (a) the method of taking the limit of 
an appropriate finite difference formula, by Gregory, Leibniz, de Moivre, and Taylor, 
(b) the method of undetermined coefficients, by Newton, Stirling, and Maclaurin, and 
(c) repeated integration by parts, or, equivalently, repeated use of the product rule, by 
Johann Bernoulli. 

Infinite series, including power series, were used extensively in the eighteenth 
century for numerical calculations. On the basis of considerable experience, mathe- 
maticians usually had a good idea of the accuracy of their results even though they did 
not perform error analyses. It was only in the second half of the eighteenth century that 
a few mathematicians started considering an explicit error term. In the specific case 
of a binomial series, Jean d’Alembert (1717-1783) obtained bounds for the remainder


10 Bernoulli and Leibniz (1745) vol. 1, pp. 13-16. 
11 See Feigenbaum (1981) chapter 4.
12 Taylor (1715). 
of the series after the first n terms.¹³ In 1754, as reported by Lacroix,¹⁴ he also found
a more general but not very useful method by which he expressed the remainder of
a Taylor series using an iterative process, and when worked out, this would have
resulted in an n-fold iterated integral. Surprisingly, in 1693 Newton proved a result
that converted an iterated integral into a single integral. If d’Alembert had used this,
he would have obtained the remainder given in many textbooks; Lagrange, de Prony, 
and Laplace discovered this remainder term using a different method. 

In an undated manuscript determined by Whiteside to date from 1693,¹⁵ apparently
written while he was revising De Quadratura, Newton worked out the nth fluent 
(integral) of y, the fluxion of y. The formula in modern notation takes the form 


1 z % a | z 
= al jdt net | ryan Dr f Pidiere 
n! 0 0 2 0 


zn zim zh-2 
| | | s 
PRR mth GO 


The expression without the polynomial part can be simplified by the binomial
theorem so that we have

$$\frac{1}{n!}\int_0^z (z-t)^n\, y\,dt.$$


If we instead take the nth iterated integral of y, then this expression takes the form 


1 z 


Newton’s result is but one step away from Taylor’s formula with the remainder as an 
integral. Compare Newton’s result with Cauchy’s work on the equation $y^{(n)} = f(x)$,
given later in this chapter. It is not clear whether Newton was aware that the Taylor 
series followed easily from his result, but he certainly revised his monograph in that 
connection. Thus, it is possible that Newton was aware of the relation between his 
integral and Taylor series. Interestingly, Newton included an equivalent of this result 
in geometric garb, but without proof, in his 1704 Tractatus. In 1727, Benjamin Robins 
published a proof in the Philosophical Transactions.¹⁶
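
The reduction of an iterated integral to a single integral can be checked exactly for polynomials. The sketch below tests the standard identity that the $n$-fold iterated integral of $y$ from $0$ to $z$ equals $\frac{1}{(n-1)!}\int_0^z (z-t)^{n-1}y(t)\,dt$; it expands $(z-t)^{n-1}$ by the binomial theorem, in the spirit of the simplification described above. This is a modern illustration, not Newton’s geometric formulation, and the names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def integrate_from_zero(coeffs):
    """Ascending coefficients of the integral from 0 of the given polynomial."""
    return [Fraction(0)] + [Fraction(a) / (k + 1) for k, a in enumerate(coeffs)]


def evaluate(coeffs, z):
    return sum(a * z ** k for k, a in enumerate(coeffs))


def iterated_integral(coeffs, n, z):
    """Integrate the polynomial n times from 0, then evaluate at z."""
    for _ in range(n):
        coeffs = integrate_from_zero(coeffs)
    return evaluate(coeffs, z)


def single_integral(coeffs, n, z):
    """(1/(n-1)!) * integral from 0 to z of (z-t)^(n-1) y(t) dt."""
    total = Fraction(0)
    for j in range(n):
        # Integral from 0 to z of t^j y(t) dt.
        moment = evaluate(integrate_from_zero([0] * j + list(coeffs)), z)
        total += (-1) ** j * comb(n - 1, j) * z ** (n - 1 - j) * moment
    return total / factorial(n - 1)


if __name__ == "__main__":
    y = [2, 0, -1, 5]            # y(t) = 2 - t^2 + 5t^3
    z = Fraction(3, 2)
    for n in (1, 2, 4):
        print(n, iterated_integral(y, n, z), single_integral(y, n, z))
```

Applied to a derivative in place of $y$, the same identity yields Taylor’s formula with the remainder expressed as an integral.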

Newton’s formula for the nth iterate of an integral appears to have escaped the 
notice of the continental mathematicians of the eighteenth century. Thus, it remained 
for Lagrange to discover an expression for the remainder term. This appeared in his 
Théorie des fonctions analytiques of 1797.¹⁷ Owing to his algebraic conception of
the calculus, Lagrange avoided the use of integrals in this work. So, to find bounds
for the remainder, he wrote down an expression for its derivative. In a later work
of 1801 entitled Leçons sur le calcul des fonctions, he generalized the mean value


13 Grabiner (1981) pp. 60-64.

14 Lacroix (1819) pp. 396-398. Note interesting footnote on p. 396.
15 Newton (1967-1981) vol. 7, pp. 164-166.

16 Robins (1727).

17 Lagrange (1797) pp. 44-45.
theorem and consequently obtained the well-known expression of the remainder as an
nth derivative, now called the Lagrange remainder.¹⁸ He applied this to a discussion of
the maximum or minimum of a function and also to his theory of the degree of contact 
between two curves. Without defining area, he also used the remainder to prove that 
the derivative of the area was the function itself. 

Though the integral form of the remainder followed immediately from his work, 
Lagrange never explicitly stated it. In his lectures of 1823, Cauchy wrote that in 
1805 Gaspard Riche de Prony (1755-1839) used integration by parts to obtain 
Taylor’s theorem with the integral remainder.¹⁹ De Prony was a noted mathematician
of his time and taught at the École Polytechnique. He is now remembered as a
leader in the construction of mathematical tables. To fill the need for the numerous 
human calculators required for this process, de Prony gave training in arithmetic to 
many hairdressers, left unemployed by the French Revolution. Pierre-Simon Laplace 
(1749-1827) included a derivation of Taylor’s theorem with remainder, using integra- 
tion by parts, in the second edition of his famous Théorie analytique des probabilités, 
published in 1814. The third volume of Lacroix’s book on calculus, of 1819, referred 
to Laplace but not to de Prony;²⁰ Cauchy may have mentioned de Prony in order to
set the record straight. 

Lagrange’s derivation of the remainder had significant gaps, though his outline 
was essentially correct. He regarded it as well known and therefore did not provide 
a proof that functions — by which he meant continuous functions, though he did not 
define continuity — had the intermediate value property as well as the maximum value 
property on a closed interval. It was not until about 1817 that Bolzano and Cauchy 
gave a precise definition of the continuity of a function and proved the intermediate 
value property. Bolzano’s definition, similar to our modern definition, was that f(x) 
was continuous if the difference $f(x + \omega) - f(x)$ could be made smaller than any
given quantity, with $\omega$ chosen as small as desired.²¹

Bernard Bolzano studied philosophy and mathematics at Charles University in 
Prague from 1796 through 1800. Although he did not particularly enjoy his math- 
ematics courses, he studied the work of Euler and Lagrange on his own. However, 
Bolzano was fully converted to mathematics by the study of Eudoxus in Euclid’s
Elements. Bolzano served as professor of theology at Prague from 1807 to 1819, but 
published several mathematics papers during this period, including his 1817 work on 
the intermediate value theorem. He based this theorem on his lemma that if a property 
M were true for all x less than u, but not for all x, then, among all values of u for 
which this was true, there existed a greatest, U. To prove this lemma, he applied a 
form of the Bolzano-Weierstrass theorem; he wrote that he had a proof of the latter 
theorem, though it has not been found among his papers. 

A prolific author of scientific and other works, during the 1830s Bolzano wrote a 
work on the foundations of real analysis, Functionenlehre, finally published in 1930.²²


18 Lagrange (1867-1892) vol. 10, pp. 94-95.

19 Cauchy (1823) p. 142. 

20 Lacroix (1819) pp. xvi and 399. 

21 For an English translation of Bolzano’s 1817 paper, see Bolzano (1980).
22 Bolzano (1930). 
In this book, Bolzano proved the extreme value theorem, which states that a continuous
function on a closed and bounded interval assumes its maximum/minimum value 
at some point in the interval. Note that eighteenth- and early nineteenth-century 
mathematicians did not feel a need to prove this theorem. In fact, as late as 1868, 
Serret’s differential calculus assumed this fact without comment.²³ Bolzano proved
the extreme value theorem in two steps. First, he proved that a continuous function on 
a closed and bounded interval has to be bounded; he then applied this result to verify 
that the function must assume the extreme values. 

Bolzano also conceived of the important idea of uniform continuity²⁴ long before
other mathematicians. Uniform continuity is needed, for example, to prove that a
continuous function on $[a, b]$ is Riemann integrable. Note that a function $f(x)$ is said
to be uniformly continuous on an interval $J$ if for each $\epsilon > 0$ there exists a $\delta > 0$ such
that $|f(x_1) - f(x_2)| < \epsilon$ whenever $x_1, x_2 \in J$ and $|x_1 - x_2| < \delta$. In the 1830s, realizing that there was a
need for a theory of real numbers, Bolzano made an unsuccessful attempt to develop it.
However, Bolzano’s insight into real analysis was deep; he was the first mathematician
to construct an example of a continuous nowhere differentiable function.²⁵ About
Bolzano, Abel wrote, amongst other doodles in his 1826 Paris notebook, “Bolzano
is a clever fellow from what I have studied;”²⁶ note that Abel’s comment was based
only on Bolzano’s early work. 

Since Bolzano’s work on the extreme value theorem was not published until 1930, 
Weierstrass tackled it, and proved the result in his Berlin lectures during the 1860s. 
Though Weierstrass did not publish these lectures, his students referred to them in 
their publications. Thus, we give our translation of a passage from Georg Cantor’s
1870 paper on trigonometric series:²⁷


The proof [that Cantor gave in his paper] is essentially based on the frequently occurring and
proven theorem contained in the lectures of Mr. Weierstrass:


“Given a real variable in an interval (a,b) (including the limits), a continuous function $\varphi(x)$
reaches the maximum value $g$ that it can assume, for at least one value of the variable, $x_0$, so that


$\varphi(x_0) = g$.”


One similar proof based thereon for the fundamental theorem of differential calculus [the mean
value theorem] was given by Ossian Bonnet; this can be found in “Cours de calcul différentiel et
intégral, par J. A. Serret, Paris 1868” in the first volume, pages 17-19.


Moreover, David Hilbert in his 1897 lecture in memory of Karl Weierstrass, a 
translated portion of which is given below, mentioned the extreme value theorem:²⁸


Of the highest importance, moreover, is the sharp distinction that Weierstrass made, as to whether 
at a point, a function reached a value or came only arbitrarily close, especially the distinction 
between the notion of a maximum or a minimum and the notion of the upper or lower limit of 


23 Serret (1868) vol. I, p. 17. 

24 Rusnock (2005). 

25 For a discussion of Bolzano’s example, see Strichartz (1995) pp. 403-406.
26 Stubhaug (2000) p. 505. 

27 Cantor (1870b) p. 141, footnote. 

28 Hilbert (1897) p. 63. 
a function of a real variable. In his theorem, according to which a continuous function of a real
variable always actually reaches its upper and lower limits, that it necessarily has a maximum and 
minimum, Weierstrass discovered a result that no mathematician engaged in research in higher 
analysis or arithmetic can do without. 


In his lectures, Weierstrass also discussed the idea of uniform continuity. In fact, 
a proof of the theorem that a continuous function on a closed interval is uniformly 
continuous was published by E. Heine in 1872,²⁹ who wrote that this proof and other
proofs in his paper had been orally communicated to him by Weierstrass and his 
students Schwarz and Cantor. 

In modern calculus books, the remainder term for the Taylor series of a function 
$f(x)$ is used to determine the values of $x$ for which the series represents the function.
This approach is due to Cauchy; in his courses at the École Polytechnique in the 1820s,
he used this method to find series for the elementary functions. Cauchy’s use of the
remainder term for this purpose was consistent with his pursuit of rigor; we also note
that in 1822 he discovered and published the fact that all the derivatives at zero of
the function $f(x) = e^{-1/x^2}$ when $x \neq 0$ and $f(0) = 0$ were equal to zero. Thus,
the Taylor series at $x = 0$ was identically zero; it therefore represented the function
only at $x = 0$. This example would have come as a great surprise to Lagrange, who
believed that all functions could be represented as series and even attempted to prove
it. He built the whole theory of calculus on this basis. He defined the derivative of
$f(x)$, for example, as the coefficient of $h$ in the series expansion of $f(x + h)$. He was
thereby attempting to eliminate vague concepts such as fluxions, infinitesimals, and 
limits in order to reduce all computations to the algebraic analysis of finite quantities. 
Cauchy, by contrast, rejected Lagrange’s foundations for analysis but accepted with 
small changes some of Lagrange’s proofs. 
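
A small numerical experiment makes Cauchy’s example concrete. The sketch below, in floating-point Python with hypothetical function names, shows that $f(x) = e^{-1/x^2}$ is positive away from zero while $f(x)/x^k$ becomes negligible near zero for each fixed $k$, which is the behavior that forces every derivative at zero to vanish.

```python
from math import exp


def cauchy_function(x):
    """f(x) = exp(-1/x^2) for x != 0, and f(0) = 0."""
    return exp(-1.0 / (x * x)) if x != 0 else 0.0


def ratio_to_power(x, k):
    """f(x) / x^k, which tends to 0 as x tends to 0 for every fixed k."""
    return cauchy_function(x) / x ** k


if __name__ == "__main__":
    # The Taylor series at 0 is identically zero, so its error at x is f(x) itself.
    for x in (0.5, 0.2, 0.1):
        print(x, cauchy_function(x))
    for k in (1, 5, 20):
        print(k, ratio_to_power(0.1, k))
```

Since the series sums to zero everywhere, the printed values of $f(x)$ are exactly the amounts by which the Taylor series fails to represent the function.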

The proof of Taylor’s theorem based on Rolle’s theorem, now commonly given in 
textbooks, seems to have first been given in J. A. Serret’s 1868 text on calculus;³⁰ he
attributed the result to Pierre Ossian Bonnet (1819-1892). In fact, Rolle proved the
theorem only for polynomials. Serret did not mention Rolle explicitly in the course of
his proof, but did mention him in his algebra book.³¹ Michel Rolle (1652-1719) was
a paid member of the Academy of Sciences of Paris. In books published in 1690 and
1691,³² Rolle established that the derivative of a polynomial $f(x)$ had a zero between
two successive real zeros of f(x). Since he did not initially accept the validity of 
calculus, Rolle worked out an algebraic procedure called the method of cascades, by 
which one could obtain the derivative of a polynomial. Euler, Lagrange, and Ruffini 
made mention of Rolle’s result, but it did not occupy a central place in calculus at that 
time because it was seen as a theorem about polynomials. Once it was extended to all 
differentiable functions, its significance was greatly increased. 


29 Heine (1872). 

30 Serret (1868) pp. 17-19.

31 Serret (1877) p. 271. 

32 Rolle (1690) and (1691). For an English translation of relevant passages, see Smith (1959) vol. 1, 
pp. 253-260. 
The modern conditions for the validity of Rolle’s theorem were given in substance
by Bonnet, but they were more carefully and exactly stated by the Italian
mathematician Ulisse Dini (1845-1918) in his lectures at the University of Pisa in
1871-1872.³³ After this, mathematicians began investigating the consequences of
relaxing the conditions. In the exercises, we state a 1909 result of W. H. Young
and Grace C. Young, using left-hand and right-hand derivatives. Grace Chisholm
(1868-1944) studied at Girton College, Cambridge and then went on to receive a
doctorate in mathematics from the University of Göttingen in 1896. Her best work was
done in real variables theory and she was among the very few women mathematicians 
of her generation with an international reputation. William Young (1863-1942) 
studied at Cambridge and became a mathematical coach there. He coached his future 
wife for the Tripos exam and took up mathematical research after their marriage 
in 1896. He published over 200 papers and was one of the most profound English 
mathematicians of the early twentieth century. It appears from a letter W. H. Young 
wrote to his wife that several papers published under his name alone were in fact joint 
efforts. In recognition of this, a volume of their selected papers was published in 2000 
under both names.³⁴ 

Returning to Bolzano and Cauchy’s proofs of the intermediate value theorem, we 
note that they both had gaps. Bolzano assumed the existence of a least upper bound 
and Cauchy’s argument produced a sequence of real numbers $a_n$, $n = 1, 2, 3, \ldots$, such 
that $a_{n+1} - a_n = \frac{1}{2}(a_n - a_{n-1})$; he assumed that such a sequence must have a limit. 
A theory of real numbers was required to shore up these proofs. Although it seems 
that by the 1830s, Bolzano had begun to understand the basic problem here,³⁵ it was 
not until the second half of the nineteenth century that mathematicians were able to 
construct the necessary framework. Richard Dedekind (1831-1916) was one of the 
first to develop it and he described his motivation in his famous paper on the theory of 
real numbers:³⁶ 


As professor in the Polytechnique School in Zurich I found myself for the first time obliged to 
lecture upon the elements of the differential calculus and felt more keenly than ever before the 
lack of a really scientific foundation for arithmetic. In discussing the notion of the approach of 
a variable magnitude to a fixed limiting value, and especially in proving the theorem that every 
magnitude which grows continually, but not beyond all limits, must certainly approach a limiting 
value, I had recourse to geometric evidences. Even now such resort to geometric intuition in a 
first presentation of the differential calculus, I regard as exceedingly useful, from the didactic 
standpoint, and indeed indispensable, if one does not wish to lose too much time. But that this 
form of introduction into the differential calculus can make no claim to being scientific, no one 
will deny. For myself this feeling of dissatisfaction was so overpowering that I made the fixed 
resolve to keep meditating on the question till I should find a purely arithmetic and perfectly 
rigorous foundation for the principles of infinitesimal calculus. 


33 Dini (1878) p. 71. 
34 Young and Young (2000). 
35 Rootselaar (1964). 
36 Dedekind (1963) pp. 1-2. 
Dedekind published his theory in 1872, though he had completed it by November 
1858. Meanwhile, by 1872, the theories of Charles Méray,³⁷ Cantor,³⁸ and 
Eduard Heine,³⁹ equivalent to Dedekind’s, were published. Though he had discovered 
it some years before, Weierstrass presented his own independently developed theory 
of real numbers as part of his lectures in Berlin during the 1860s.⁴⁰ 


11.2 Gregory’s Discovery of the Taylor Series 


In 1671, Gregory gave power series expansions of the seven functions mentioned 
earlier. His notation was naturally different from the one we now use. For example, he 
described the series for tan x and ln sec x: 


If radius = r, arcus = a, tangens = t, secans artificialis = s, then 

$$t = a + \frac{a^3}{3r^2} + \frac{2a^5}{15r^4} + \frac{17a^7}{315r^6} + \frac{3233a^9}{181440r^8} + \text{etc.}, \tag{11.4}$$

$$s = \frac{a^2}{2r} + \frac{a^4}{12r^3} + \frac{a^6}{45r^5} + \frac{17a^8}{2520r^7} + \frac{3233a^{10}}{1814400r^9} + \text{etc.}$$


Gregory’s descriptions of the $\ln\tan\left(\frac{\pi}{4} + \frac{a}{2}\right)$ and $\operatorname{arcsec}(\sqrt{2}e^{x})$ functions were 
slightly more complicated. In his letter to Collins, he gave no indication of how he 
obtained his seven series, but H. W. Turnbull determined that, except for the series 
for arctan x, Gregory obtained them by using their derivatives.⁴¹ While examining 
Gregory’s unpublished notes in the 1930s, Turnbull noticed that Gregory had written 
the successive derivatives of some trigonometric and logarithmic functions on the back 
of a letter, dated January 29, 1671, from Gideon Shaw, an Edinburgh stationer. For 
example, he gave the first seven derivatives of $r\tan\theta$ with respect to $\theta$ expressed 
as polynomials in $r\tan\theta = q$. He denoted the function and its derivatives by m so 
that he had 

$$\begin{aligned}
\text{1st}\quad m &= q,\\
\text{2nd}\quad m &= r + \frac{q^2}{r},\\
\text{3rd}\quad m &= 2q + \frac{2q^3}{r^2},\\
\text{4th}\quad m &= 2r + \frac{8q^2}{r} + \frac{6q^4}{r^3},\\
\text{5th}\quad m &= 16q + \frac{40q^3}{r^2} + \frac{24q^5}{r^4},
\end{aligned}$$

37 Méray (1869). 

38 Cantor (1872). 

39 Heine (1872). 

40 Dugac (1973) pp. 57-59. See also Snow (2003). 
41 Turnbull (1939) pp. 168-176. 
$$\begin{aligned}
\text{6th}\quad m &= 16r + \frac{136q^2}{r} + \frac{240q^4}{r^3} + \frac{120q^6}{r^5},\\
\text{7th}\quad m &= 272q + \frac{987q^3}{r^2} + \frac{1680q^5}{r^4} + \frac{720q^7}{r^6},\\
\text{8th}\quad m &= 272r + \frac{3233q^2}{r} + \frac{11361q^4}{r^3} + \frac{13440q^6}{r^5} + \frac{5040q^8}{r^7}.
\end{aligned}$$


Note that since the derivative of $\tan\theta$ is $\sec^2\theta = 1 + \frac{q^2}{r^2}$, one can move from one 
value of m to the next by taking the derivative of the initial m with regard to q and 
then multiplying by $r + \frac{q^2}{r}$. This suggests that Gregory used a method equivalent to 
the chain rule; indeed, this conclusion is supported by his computational mistake in 
finding the seventh m from the sixth: 

$$\left(\frac{272q}{r} + \frac{960q^3}{r^3} + \frac{720q^5}{r^5}\right)\left(r + \frac{q^2}{r}\right).$$


The coefficient of $\frac{q^3}{r^2}$ is 272 + 960 = 1232, whereas Gregory had 987. Evidently, 
he had miscopied 272 from the previous step as 27 to get 27 + 960 = 987. This in 
turn produced an error in the coefficient of $\frac{q^2}{r}$ in the eighth m; this should be 3968 
instead of Gregory’s 3233. If this computation, with Gregory’s mistake, is continued 
to the tenth m, that is, the ninth derivative of $r\tan\theta$, then the first term of the derivative 
would be 2 × 3233 = 6466, as given in his notes. 
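
The recurrence just described is easy to check mechanically. The following is a minimal sketch in Python, assuming r = 1 (so that q = tan θ) and storing each m as a dictionary from powers of q to integer coefficients; the function names are illustrative and are not taken from Gregory or Turnbull.

```python
from fractions import Fraction
from math import factorial


def next_m(coeffs):
    """Given m as {power of q: coefficient} with r = 1, return (dm/dq) * (1 + q**2)."""
    deriv = {p - 1: c * p for p, c in coeffs.items() if p != 0}
    out = {}
    for p, c in deriv.items():
        out[p] = out.get(p, 0) + c          # contribution of the factor 1
        out[p + 2] = out.get(p + 2, 0) + c  # contribution of the factor q**2
    return out


def gregory_table(count):
    """Return the first `count` values of m, starting from the 1st m = q."""
    ms = [{1: 1}]
    while len(ms) < count:
        ms.append(next_m(ms[-1]))
    return ms


table = gregory_table(10)
print(table[6])   # 7th m: the coefficient of q**3 is 1232, not Gregory's 987
print(table[7])   # 8th m: the coefficient of q**2 is 3968, not Gregory's 3233

# Coefficient of theta**9 in the Maclaurin series of tan(theta):
correct = Fraction(table[9][0], factorial(9))
gregory = Fraction(6466, factorial(9))   # Gregory's value, carrying his slip forward
print(correct, gregory)
```

With the slip corrected, the last line gives 62/2835 for the coefficient of the ninth power, while Gregory’s 6466 reduces to 3233/181440, the value that appears in (11.4).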

Note that the Maclaurin series for $\tan\theta$ is obtained by computing the derivatives 
at $\theta = 0$. According to Gregory’s mistaken calculation, the coefficient of $a^9$ in the 
series for $\tan\theta$ in the letter to Collins would be $\frac{6466}{9!}$. This simplifies to $\frac{3233}{181440}$, just as 
Gregory noted in (11.4). As Turnbull pointed out, the appearance of this key error 
in Gregory’s letter fortunately allows us to see that the calculations on the back 
of Shaw’s letter were for the purpose of constructing the series. Thus, though no 
explicit statement of Maclaurin’s formula has been found in Gregory’s papers, we may 
conclude that Gregory was implicitly aware of it, since he made use of it in so many 
instances. 

In 1713, Newton, then president of the Royal Society, insisted that the society pub- 
lish relevant portions of Gregory’s letters to Collins in the Commercium Epistolicum to 
prove his own absolute priority in the discovery of the calculus. Recall that Gregory’s 
letters referred to the series of Newton communicated to him by Collins. But in the 
published accounts, Gregory’s computational error was corrected. 

Gregory found the series for $\operatorname{arcsec}(\sqrt{2}e^{x})$ by taking the derivatives of $r\theta$ with 
respect to $\ln\sec\theta$. In his notes, he wrote down the first five derivatives employed 
to construct the series in the letter to Collins. If we write $y = r\theta$, $x = \ln\sec\theta$, 
$q = r\tan\theta$, then we have 

$$\frac{dy}{dx} = \frac{r\,d\theta}{\tan\theta\,d\theta} = \frac{r^2}{q} \quad\text{and}\quad \frac{dq}{dx} = \frac{r^2}{q} + q.$$

This implies that the successive derivatives can be found by taking the derivative 
with respect to q and multiplying by $\frac{r^2}{q} + q$: 

$$\frac{d^2y}{dx^2} = \frac{d}{dq}\left(\frac{r^2}{q}\right)\cdot\frac{dq}{dx} = -\frac{r^2}{q} - \frac{r^4}{q^3}, \qquad \frac{d^3y}{dx^3} = \frac{r^2}{q} + \frac{4r^4}{q^3} + \frac{3r^6}{q^5}, \quad\text{etc.}$$

Except for the signs of the derivatives, Gregory wrote precisely these expressions 
in his notes. He also wrote, without signs, expressions for the next two derivatives: 

$$\frac{r^2}{q} + \frac{13r^4}{q^3} + \frac{27r^6}{q^5} + \frac{15r^8}{q^7}, \qquad \frac{r^2}{q} + \frac{40r^4}{q^3} + \frac{174r^6}{q^5} + \frac{240r^8}{q^7} + \frac{105r^{10}}{q^9}.$$

Gregory then expanded $y = r\left(\theta - \frac{\pi}{4}\right)$ as a series in $x = \ln\frac{\sec\theta}{\sqrt{2}}$ about x = 0. 
Now when x = 0, then $\theta = \frac{\pi}{4}$ and q = r. Hence, he had the series, given in modern 
notation: 

$$\theta = \frac{\pi}{4} + x - x^2 + \frac{4}{3}x^3 - \frac{7}{3}x^4 + \frac{14}{3}x^5 - \cdots.$$

To see how the constants in this series are produced, consider the coefficient of $x^3$ 
obtained from the third derivative with q = r. From the preceding expression for $\frac{d^3y}{dx^3}$, 
this value can be given as r + 4r + 3r = 8r, and since this has to be divided by 3! = 6, 
we arrive at $\frac{8r}{6}$, or simply $\frac{4}{3}$ after dividing by r. 
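
The same bookkeeping can be sketched in a few lines of Python. The sketch assumes r = 1, stores each derivative as a dictionary from (negative) powers of q to coefficients, and uses a hypothetical helper name; it reproduces the derivatives quoted above and the first few series coefficients obtained by setting q = 1 and dividing by k!.

```python
from fractions import Fraction
from math import factorial


def next_derivative(coeffs):
    """Differentiate {power of q: coefficient} in q and multiply by (1/q + q), with r = 1."""
    deriv = {p - 1: c * p for p, c in coeffs.items() if p != 0}
    out = {}
    for p, c in deriv.items():
        out[p - 1] = out.get(p - 1, 0) + c  # times 1/q
        out[p + 1] = out.get(p + 1, 0) + c  # times q
    return out


derivatives = [{-1: 1}]          # dy/dx = 1/q when r = 1
while len(derivatives) < 5:
    derivatives.append(next_derivative(derivatives[-1]))

# Evaluate at q = 1 (theta = pi/4) and divide the k-th derivative by k!.
series = [Fraction(sum(d.values()), factorial(k))
          for k, d in enumerate(derivatives, start=1)]
print(derivatives[2])   # third derivative: coefficients 1, 4, 3 as in the text
print(series)           # coefficients of x, x**2, ..., x**5
```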


11.3 Newton: An Iterated Integral as a Single Integral 


Newton wrote up his evaluation of the nth iterated integral as a single integral 
sometime around 1693, but did not publish it. His main idea was to repeatedly use 
integration by parts to reduce a double integral to a single integral.⁴² We reproduce 
his derivation, though we change his notation. He used the letters A, B, C, ... to 
denote areas under $\dot{y}$, $z\dot{y}$, $z^2\dot{y}$, ... but we shall use these letters to denote the areas 
under $y$, $zy$, $z^2y$, ... to obtain the result in standard form. We also employ the Fourier-Leibniz 
notation of a definite integral to denote area. Let y = f(t) be a curve, and 
let $A = \int_0^z y\,dt$, $B = \int_0^z ty\,dt$, $C = \int_0^z t^2y\,dt$, $D = \int_0^z t^3y\,dt$, .... Then for some 
constant a 

$$\int_a^z y\,dt = A + g,$$

where g is a constant. The second iterated integral of y is 

$$\int_a^z (A + g)\,dt = zA - \int_a^z t\,\frac{dA}{dt}\,dt + gz + h,$$


42 Newton (1967-1981) vol. 7, pp. 164-166. 
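where h is some constant. Integration of this expression gives 

$$\begin{aligned}
&\frac{1}{2}z^2A - zB - \int_a^z\left(\frac{1}{2}t^2\frac{dA}{dt} - t\frac{dB}{dt}\right)dt + \frac{1}{2}gz^2 + hz + i\\
&\quad= \frac{1}{2}z^2A - zB - \int_a^z\left(\frac{1}{2}\frac{dC}{dt} - \frac{dC}{dt}\right)dt + \frac{1}{2}gz^2 + hz + i\\
&\quad= \frac{1}{2}z^2A - zB + \frac{1}{2}C + \frac{1}{2}gz^2 + hz + i.
\end{aligned}$$

This is the third iterated integral of y. The integral of its first three terms is 

$$\begin{aligned}
&\frac{1}{6}z^3A - \frac{1}{2}z^2B + \frac{1}{2}zC - \int_a^z\left(\frac{1}{6}t^3\frac{dA}{dt} - \frac{1}{2}t^2\frac{dB}{dt} + \frac{1}{2}t\frac{dC}{dt}\right)dt + \text{constant}\\
&\quad= \frac{1}{6}\left(z^3A - 3z^2B + 3zC\right) - \int_a^z\frac{1}{6}\frac{dD}{dt}\,dt + \text{constant}\\
&\quad= \frac{1}{6}\left(z^3A - 3z^2B + 3zC - D\right) + \text{constant}.
\end{aligned}$$

Hence the fourth iterated integral is 

$$\frac{1}{6}\left(z^3A - 3z^2B + 3zC - D\right) + \frac{1}{6}gz^3 + \frac{1}{2}hz^2 + iz + k.$$

Newton worked out another iterate to obtain 

$$\frac{1}{24}\left(z^4A - 4z^3B + 6z^2C - 4zD + E\right) + \frac{1}{24}gz^4 + \frac{1}{6}hz^3 + \frac{1}{2}iz^2 + kz + l.$$

By induction, he wrote the general nth iterate, in our notation, as 

$$\frac{1}{(n-1)!}\left(z^{n-1}\int_0^z y\,dt - (n-1)z^{n-2}\int_0^z ty\,dt + \frac{(n-1)(n-2)}{2}z^{n-3}\int_0^z t^2y\,dt - \cdots\right) + \frac{1}{(n-1)!}gz^{n-1} + \frac{1}{(n-2)!}hz^{n-2} + \cdots. \tag{11.5}$$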


Newton left the integral in this form, although it is clear that he could easily have 
applied the binomial theorem to obtain the integral in the form (11.3). 
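
Newton’s expansion can be checked exactly for polynomial integrands. The sketch below is only an illustration under simplifying assumptions: the lower limit is a = 0 and all constants of integration g, h, i, ... are taken to be zero, so that the nth iterated integral should equal the bracketed sum in (11.5) divided by (n − 1)!. Polynomials are stored as coefficient lists, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def integrate(poly):
    """Antiderivative vanishing at 0 of a polynomial given as [c0, c1, c2, ...]."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(poly)]


def evaluate(poly, z):
    return sum(Fraction(c) * z**k for k, c in enumerate(poly))


def iterated_integral(poly, n, z):
    """Integrate n times from 0, then evaluate at z."""
    for _ in range(n):
        poly = integrate(poly)
    return evaluate(poly, z)


def newton_form(poly, n, z):
    """(1/(n-1)!) * sum_j (-1)^j C(n-1, j) z^(n-1-j) * int_0^z t^j y(t) dt."""
    total = Fraction(0)
    for j in range(n):
        shifted = [Fraction(0)] * j + list(poly)      # coefficients of t^j * y(t)
        moment = evaluate(integrate(shifted), z)       # A, B, C, ... for j = 0, 1, 2, ...
        total += (-1) ** j * comb(n - 1, j) * z ** (n - 1 - j) * moment
    return total / factorial(n - 1)


y = [1, -2, 0, 5]            # y(t) = 1 - 2t + 5t^3, an arbitrary test polynomial
z = Fraction(3, 2)
for n in range(1, 6):
    print(n, iterated_integral(y, n, z) == newton_form(y, n, z))
```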


11.4 Bernoulli and Leibniz: A Form of the Taylor Series 


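Johann Bernoulli’s 1694 result on series was stated in a paper⁴³ and in a letter to 
Leibniz⁴⁴ as 

$$\text{Integr. } n\,dz = nz - \frac{zz\,dn}{1\cdot 2\,dz} + \frac{z^3\,ddn}{1\cdot 2\cdot 3\,dz^2} - \frac{z^4\,dddn}{1\cdot 2\cdot 3\cdot 4\,dz^3} + \text{etc.} \tag{11.6}$$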


43 Bernoulli, Johann (1968) vol. 1, pp. 125-128, especially p. 126. 
44 Leibniz and Bernoulli (1745) pp. 13-16. 
Here Integr. n dz stood for the integral of n, or $\int n\,dz$. In fact, the term “integral” 
was first used by the Bernoulli brothers, Jakob and Johann, who conceived of it as the 
antiderivative. In a letter to Johann, Leibniz once wrote that he preferred to think of 
the integral as a sum instead of as an antiderivative. Bernoulli’s proof of this result 
was very simple: 

$$n\,dz = n\,dz + z\,dn - z\,dn - \frac{zz\,ddn}{1\cdot 2\,dz} + \frac{zz\,ddn}{1\cdot 2\,dz} + \frac{z^3\,dddn}{1\cdot 2\cdot 3\,dz^2} - \text{etc.}$$

He took the terms on the right in pairs to get 

$$n\,dz = d(nz) - d\left(\frac{zz\,dn}{1\cdot 2\,dz}\right) + d\left(\frac{z^3\,ddn}{1\cdot 2\cdot 3\,dz^2}\right) - \text{etc.}$$

The required result followed upon integration. This process amounts to repeated 
integration by parts applied to $\int n\,dz$. Bernoulli applied his formula to three questions: 
de Beaune’s problem; determination of the series for ln(a + x); and the determination 
of the series for sin x. He was not completely successful with the third problem and 
was only able to find sin x as a ratio of two series. 
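
For a polynomial n the series (11.6) terminates, so it can be verified exactly. The following sketch assumes the integral is taken from 0 to z and that n is given by a list of coefficients; the helper names are illustrative only.

```python
from fractions import Fraction
from math import factorial


def evaluate(poly, z):
    return sum(Fraction(c) * z**k for k, c in enumerate(poly))


def derivative(poly):
    return [k * c for k, c in enumerate(poly)][1:]


def antiderivative(poly):
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(poly)]


def bernoulli_series(poly, z):
    """n z - z^2 n'/2! + z^3 n''/3! - ..., with n and its derivatives evaluated at z."""
    z = Fraction(z)
    total, k, current = Fraction(0), 1, list(poly)
    while current:                       # stops once the derivative is identically zero
        total += (-1) ** (k + 1) * z**k / factorial(k) * evaluate(current, z)
        current = derivative(current)
        k += 1
    return total


n = [2, -3, 0, 4]                        # n(z) = 2 - 3z + 4z^3, an arbitrary example
z = Fraction(5, 3)
print(bernoulli_series(n, z) == evaluate(antiderivative(n), z))
```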

In reply to Bernoulli’s 1694 letter containing the result (11.6), Leibniz outlined his 
own derivation of the formula,⁴⁵ instructive as an illustration of Leibniz’s conception 
of the analogy between finite and infinitesimal differences, leading to his characteristic 
approach to the calculus. We change Leibniz’s notation slightly in the initial part of his 
derivation; he himself used neither subscripts nor the difference operator. Supposing 
the sequence $a_0, a_1, a_2, \ldots$ decreases to zero, Leibniz started with the equation 

$$a_0 = -(\Delta a_0 + \Delta a_1 + \Delta a_2 + \cdots).$$

Since 

$$\Delta a_n = \Delta a_0 + \frac{n}{1}\Delta^2 a_0 + \frac{n(n-1)}{1\cdot 2}\Delta^3 a_0 + \cdots,$$

Leibniz could rewrite the first equation as 

$$\begin{aligned}
a_0 &= -(\Delta a_0 + \Delta a_0 + \Delta^2 a_0 + \Delta a_0 + 2\Delta^2 a_0 + \Delta^3 a_0 + \cdots)\\
&= -\left(\Delta a_0(1 + 1 + 1 + \cdots) + \Delta^2 a_0(1 + 2 + 3 + \cdots)\right)\\
&\quad - \left(\Delta^3 a_0(1 + 3 + 6 + \cdots) + \cdots\right).
\end{aligned}$$

He then observed that a similar relation continued to hold when the differences 
were infinitely small and he replaced $a_0, -\Delta a_0, \Delta^2 a_0, -\Delta^3 a_0, \ldots$ by $y = y(x)$, 
$dy, ddy, d^3y, \ldots$ respectively; moreover, by letting the infinitely small dx become 1, 
he set 


45 ibid. pp. 18-24, especially pp. 21 and 22. 
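$$1 + 1 + 1 + \cdots = x, \qquad 1 + 2 + 3 + \cdots = \int x, \qquad 1 + 3 + 6 + 10 + \cdots = \iint x, \quad \text{etc.}$$

Since 

$$\int x = \frac{x^2}{1\cdot 2}, \qquad \iint x = \frac{x^3}{1\cdot 2\cdot 3}, \quad \ldots,$$

Leibniz obtained 

$$y = \frac{1}{1}x\frac{dy}{dx} - \frac{1}{1\cdot 2}x^2\frac{ddy}{dx^2} + \frac{1}{1\cdot 2\cdot 3}x^3\frac{d^3y}{dx^3} - \frac{1}{1\cdot 2\cdot 3\cdot 4}x^4\frac{d^4y}{dx^4} + \text{etc.}$$

He then noted that Bernoulli’s formula followed upon replacing y, dy, ddy, etc. by 
$\int y$, y, dy, etc., respectively. 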


11.5 Taylor and Euler on the Taylor Series 


In Taylor’s book of 1715, he obtained his namesake series from the well-known 
interpolation formula by letting the distance between the equidistant points on the 
axis tend to zero. We shall follow Euler’s exposition of 1736,⁴⁶ since Euler used a 
more convenient and easily understandable notation. Euler divided the interval from x 
to x + a into m equal parts, each equal to dx. He let y be a function of x and then let 
dy = y(x + dx) — y(x), ddy = y(x + 2dx) — 2y(x + dx) + y(x),... be the first, 
second,... differences at x. He then had 


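$$y(x + 2dx) = y + 2dy + ddy, \qquad y(x + 3dx) = y + 3dy + 3ddy + d^3y,$$

$$y(x + a) = y(x + m\,dx) = y + m\,dy + \frac{m(m-1)}{1\cdot 2}ddy + \frac{m(m-1)(m-2)}{1\cdot 2\cdot 3}d^3y + \text{etc.}$$

Next, he let m be an infinite number and dx infinitely small so that m dx was finite 
and equal to a. Then 

$$\begin{aligned}
y(x + a) &= y + m\,dy + \frac{m^2}{1\cdot 2}ddy + \frac{m^3}{1\cdot 2\cdot 3}d^3y + \text{etc.}\\
&= y + \frac{a\,dy}{dx} + \frac{a^2\,ddy}{1\cdot 2\,dx^2} + \frac{a^3\,d^3y}{1\cdot 2\cdot 3\,dx^3} + \text{etc.}
\end{aligned}$$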


Gregory, de Moivre, and Taylor derived the Taylor series by means of essentially 
the same argument. 
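
The finite identity underlying this argument, and the way its terms approach those of the Taylor series as m grows, can be illustrated numerically. The sketch below is an assumption-laden illustration rather than Euler’s own computation: it uses exact rational arithmetic, a sample cubic, and hypothetical function names.

```python
from fractions import Fraction
from math import comb


def forward_differences(y, x, dx, order):
    """Return [y, dy, ddy, ...] at x for step dx, up to the given order."""
    values = [y(x + k * dx) for k in range(order + 1)]
    diffs = []
    while values:
        diffs.append(values[0])
        values = [b - a for a, b in zip(values, values[1:])]
    return diffs


def euler_terms(y, x, a, m):
    """Terms C(m, k) * d^k y in the expansion of y(x + m dx), where dx = a / m."""
    dx = Fraction(a) / m
    diffs = forward_differences(y, Fraction(x), dx, m)
    return [comb(m, k) * d for k, d in enumerate(diffs)]


def cubic(t):
    return t**3 - 2 * t


terms = euler_terms(cubic, 1, 1, 200)
print(sum(terms) == cubic(Fraction(2)))   # the finite-difference identity holds exactly
print(float(terms[1]), float(terms[2]))   # near a*y'(1) = 1 and a^2*y''(1)/2 = 3
```

For this cubic the second-order term equals 3(1 − 1/m²), so it tends to a²y''(x)/2 = 3 as m increases, in line with Euler’s replacement of m(m − 1) by m².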


46 Eu. I-14 pp. 108-123, especially pp. 108-110. E 47. 
Euler then showed how Bernoulli’s series could be derived from Taylor’s formula. 
He set y(0) = 0 and a = −x to get 

$$0 = y - \frac{x\,dy}{1\,dx} + \frac{x^2\,ddy}{1\cdot 2\,dx^2} - \frac{x^3\,d^3y}{1\cdot 2\cdot 3\,dx^3} + \text{etc.}$$

This implied 

$$y = \frac{x\,dy}{1\,dx} - \frac{x^2\,ddy}{1\cdot 2\,dx^2} + \frac{x^3\,d^3y}{1\cdot 2\cdot 3\,dx^3} - \text{etc.}$$

Euler then replaced y by $\int y\,dx$, as Leibniz did, and obtained Bernoulli’s formula. See 
the exercises for the converse. 


11.6 Lacroix on D’Alembert’s Derivation of the Remainder 


In his 1754 book, Recherches sur différents points importants du système du monde, 
Jean d’Alembert obtained the n-dimensional iterated integral for the remainder in the 
Taylor series. In his 1819 book, Sylvestre Lacroix (1765-1843) presented the essence 
of d’Alembert’s proof in notation more familiar to us:⁴⁷ 

Lacroix let u' = u(x + h) and u = u(x) and set u' = u + P. Then 

$$\frac{du'}{dh} = \frac{dP}{dh}$$

and hence 

$$P = \int \frac{du'}{dh}\,dh.$$

Note that the derivatives of u' are partial derivatives; for now, we follow Lacroix’s 
notation. Thus, he had 

$$u' = u + \int \frac{du'}{dh}\,dh.$$

Next, he let 

$$\frac{du'}{dh} = \frac{du}{dx} + Q,$$

so that 

$$\frac{d^2u'}{dh^2} = \frac{dQ}{dh}, \qquad Q = \int \frac{d^2u'}{dh^2}\,dh,$$


47 Lacroix (1819) pp. 396-397. 
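$$\frac{du'}{dh} = \frac{du}{dx} + \int \frac{d^2u'}{dh^2}\,dh, \qquad \int \frac{du'}{dh}\,dh = \frac{du}{dx}\frac{h}{1} + \iint \frac{d^2u'}{dh^2}\,dh^2,$$

$$u' = u + \frac{du}{dx}\frac{h}{1} + \iint \frac{d^2u'}{dh^2}\,dh^2.$$

Setting 

$$\frac{d^2u'}{dh^2} = \frac{d^2u}{dx^2} + R,$$

he had 

$$\frac{d^3u'}{dh^3} = \frac{dR}{dh}, \qquad R = \int \frac{d^3u'}{dh^3}\,dh, \qquad \frac{d^2u'}{dh^2} = \frac{d^2u}{dx^2} + \int \frac{d^3u'}{dh^3}\,dh,$$

$$u' = u + \frac{du}{dx}\frac{h}{1} + \frac{d^2u}{dx^2}\frac{h^2}{1\cdot 2} + \iiint \frac{d^3u'}{dh^3}\,dh^3.$$

Continuing in the same manner, he had in general 

$$u' = u + \frac{du}{dx}\frac{h}{1} + \frac{d^2u}{dx^2}\frac{h^2}{1\cdot 2} + \cdots + \frac{d^{n-1}u}{dx^{n-1}}\frac{h^{n-1}}{1\cdot 2\cdots(n-1)} + \int^n \frac{d^nu'}{dh^n}\,dh^n,$$

where the n-fold multiple integral $\int^n$ was zero when h = 0. If Newton’s formula 
(11.3) for $\int^n$ had been used here, then the remainder would have emerged in the form 

$$R_n(h) = \frac{1}{(n-1)!}\int_0^h (h - t)^{n-1}\frac{d^n}{dx^n}u(x + t)\,dt. \tag{11.7}$$

In his 1823 lectures on calculus, Cauchy showed the equivalence of the two 
remainders by proving 

$$\frac{d^nR_n}{dh^n} = \frac{d^n}{dx^n}u(x + h).$$

He applied the fundamental theorem of calculus and Leibniz’s formula for the 
derivative of an integral to obtain 

$$\begin{aligned}
\frac{dR_n}{dh} &= \frac{1}{(n-1)!}\int_0^h \frac{d}{dh}(h - t)^{n-1}\frac{d^n}{dx^n}u(x + t)\,dt\\
&= \frac{1}{(n-2)!}\int_0^h (h - t)^{n-2}\frac{d^n}{dx^n}u(x + t)\,dt.
\end{aligned} \tag{11.8}$$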


Cauchy derived the desired result, and in effect a proof of Newton’s formula, by 
performing this process n times. Lacroix did not provide a proof of (11.7), merely 
noting that 
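$$\int^n \frac{d^nu'}{dh^n}\,dh^n = \frac{1}{1\cdot 2\cdots(n-1)}\left(h^{n-1}\int \frac{d^nu'}{dh^n}\,dh - (n-1)h^{n-2}\int \frac{d^nu'}{dh^n}\,h\,dh + \frac{(n-1)(n-2)}{1\cdot 2}h^{n-3}\int \frac{d^nu'}{dh^n}\,h^2\,dh - \cdots\right).$$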


This result is equivalent to Newton’s formula (11.5) and, as we have noted, Newton 
actually stated it in this form. Lacroix could have proved this inductively, using 
integration by parts. He did not mention Newton in this context; one may assume 
that he was not aware of Newton’s work, since Lacroix was very meticulous in stating 
his sources. 


11.7 Lagrange’s Derivation of the Remainder Term 


In his 1797 book, Fonctions analytiques, Joseph-Louis Lagrange (1736-1813) 
obtained the remainder term of the Taylor series as a single integral.⁴⁸ He started 
with 

$$fx = f(x - xz) + xP,$$

where P = 0 at z = 0. The derivative with respect to z of this equation gave 

$$0 = -xf'(x - xz) + xP' \quad\text{or}\quad P' = f'(x - xz).$$

For the second-order remainder, Lagrange wrote 

$$fx = f(x - xz) + xzf'(x - xz) + x^2Q$$

and obtained 

$$Q' = zf''(x - xz).$$

Similarly, 

$$fx = f(x - xz) + xzf'(x - xz) + \frac{x^2z^2}{2}f''(x - xz) + x^3R$$

and, after taking the derivative and simplifying, he got $R' = \frac{z^2}{2}f'''(x - xz)$. Lagrange 
did not write the general expression for the remainder and gave only the recursive 
procedure. This method gives the remainder as an integral, though Lagrange did not 
write it in that form, since he avoided the use of integrals in this book. It is easy to 
see that 

$$R = \frac{1}{2}\int_0^z t^2f'''(x - xt)\,dt = \frac{1}{2x^3}\int_{x - xz}^{x}(x - u)^2f'''(u)\,du. \tag{11.9}$$


48 Lagrange (1797) pp. 43-45. 
Had he stated it explicitly, the general form of Lagrange’s formula would have been 

$$fx = f(x - xz) + xzf'(x - xz) + \frac{x^2z^2}{2!}f''(x - xz) + \cdots + \frac{x^{n-1}z^{n-1}}{(n-1)!}f^{(n-1)}(x - xz) + x^nR_n,$$

where 

$$x^nR_n = \frac{1}{(n-1)!}\int_{x - xz}^{x}(x - u)^{n-1}f^{(n)}(u)\,du.$$

If we replace x − xz by a, Lagrange’s formula becomes 

$$f(x) = f(a) + (x - a)f'(a) + \frac{(x - a)^2}{2!}f''(a) + \cdots + \frac{(x - a)^{n-1}}{(n-1)!}f^{(n-1)}(a) + \frac{1}{(n-1)!}\int_a^x (x - u)^{n-1}f^{(n)}(u)\,du.$$

Thus, here the multiple integral remainder of d’Alembert was replaced by a single 
integral. However, Lagrange himself gave only the derivative of the remainder. In his 
1799 lectures on the calculus, published in 1801 as Leçons sur le calcul des fonctions,⁴⁹ 
he presented the remainder as it appears in modern texts, as an nth derivative. 

Lagrange proved a lemma for the purpose of determining bounds for $R_n$: If f'x is 
positive for all values of x between x = a and x = b with b > a, then fb − fa > 0. 
To prove this statement, Lagrange set f(x + i) = fx + iP, where P was a function 
of x and i, such that at i = 0, P = f'(x) > 0. So P(x, i) > 0 for i sufficiently small, 
and it followed that f(x + i) − f(x) > 0 for small i. Next, he divided the interval 
[a, b] into n equal parts, each of length $j = \frac{b - a}{n}$, with n sufficiently large that in 
each subinterval 

$$[a + kj,\ a + (k + 1)j], \quad k = 0, 1, \ldots, n - 1,$$

he had 

$$f(a + (k + 1)j) - f(a + kj) > 0.$$

By adding up these inequalities, he got fb − fa > 0. 

Lagrange’s lemma was correct but his proof was obviously inadequate. For 
example, he assumed that the same j would work in all parts of the interval. But 
he went on to use the result to derive a different form of the remainder. He supposed 
f'(q) and f'(p) to be the maximum and minimum values, respectively, of f'(x) in an 
interval. Then g'(i) = f'(x + i) − f'(p) and h'(i) = f'(q) − f'(x + i) were both 
positive. Lagrange’s lemma then gave 

$$g(i) = f(x + i) - f(x) - if'(p) \geq 0, \qquad h(i) = if'(q) - f(x + i) + f(x) \geq 0.$$


49 Lagrange (1867-1892) vol. 10, pp. 91-95. 




These inequalities implied bounds for f(x + i): 

$$f(x) + if'(p) \leq f(x + i) \leq f(x) + if'(q)$$

and thus 

$$f(x + i) = f(x) + if'(x + \theta i), \quad 0 < \theta < 1,$$

by use of the intermediate value theorem, implicitly assumed by Lagrange; similarly, 
he assumed that f' had a maximum/minimum in an interval. More generally, Lagrange 
showed that f(x + i) lay between 

$$f(x) + if'(x) + \frac{i^2}{2!}f''(x) + \cdots + \frac{i^u}{u!}f^{(u)}(p)$$

and 

$$f(x) + if'(x) + \frac{i^2}{2!}f''(x) + \cdots + \frac{i^u}{u!}f^{(u)}(q),$$

where p and q were the values at which $f^{(u)}$ had a minimum and maximum, 
respectively, in the given interval. Once again, an application of the intermediate value 
theorem would yield Taylor’s formula with the remainder as a derivative: 

$$f(x + i) = f(x) + if'(x) + \frac{i^2}{2!}f''(x) + \cdots + \frac{i^u}{u!}f^{(u)}(x + \theta i), \quad 0 < \theta < 1.$$


11.8 Laplace’s Derivation of the Remainder Term 


After being launched in his career by d’Alembert, Laplace used his tremendous 
command of analysis to make groundbreaking contributions in his areas of interest, 
celestial mechanics and probability. In the second edition of his famous 1812 
work, Théorie analytique des probabilités, published in 1814,⁵⁰ Laplace used repeated 
integration by parts in a direct way to obtain the remainder term. He started with the 
observation that 

$$\int dz\,\phi'(x - z) = \phi(x) - \phi(x - z), \tag{11.10}$$

when the lower limit of integration was z = 0. This result, the fundamental theorem 
of calculus, would be written in modern notation: 

$$\int_0^z \phi'(x - t)\,dt = \phi(x) - \phi(x - z).$$


50 Laplace (1814), especially pp. 176-177. 
Using Laplace’s notation, integration by parts gave 

$$\int dz\,\phi'(x - z) = z\phi'(x - z) + \int z\,dz\,\phi''(x - z),$$

$$\int z\,dz\,\phi''(x - z) = \frac{1}{2}z^2\phi''(x - z) + \frac{1}{2}\int z^2\,dz\,\phi'''(x - z), \quad\text{etc.}$$

Hence, in general 

$$\int dz\,\phi'(x - z) = z\phi'(x - z) + \frac{z^2}{1\cdot 2}\phi''(x - z) + \cdots + \frac{z^n}{1\cdot 2\cdot 3\cdots n}\phi^{(n)}(x - z) + \frac{1}{1\cdot 2\cdot 3\cdots n}\int z^n\,dz\,\phi^{(n+1)}(x - z). \tag{11.11}$$

Combined with (11.10), this equation provided Taylor’s theorem with remainder. 

Laplace then converted this remainder to the Lagrange form. Since $\int z^n\,dz = \frac{z^{n+1}}{n+1}$, 
the integral in (11.11) lies between $\frac{mz^{n+1}}{n+1}$ and $\frac{Mz^{n+1}}{n+1}$, where m and M are the smallest 
and largest values of $\phi^{(n+1)}(x - z)$ in the interval of integration. Hence, the value of 
the integral in (11.11) lies in between these values and is given by 

$$\frac{z^{n+1}}{n+1}\phi^{(n+1)}(x - u),$$

where u is some value between 0 and z. Thus, the remainder term in (11.11) can be 
written as 

$$\frac{z^{n+1}}{1\cdot 2\cdot 3\cdots(n+1)}\phi^{(n+1)}(x - u).$$


This completed Laplace’s derivation of the two forms of the remainder in Taylor’s 
theorem. 


11.9 Cauchy on Taylor’s Formula and l’Hôpital’s Rule 


In his lectures published in 1823,⁵¹ Cauchy took an interesting approach to Newton’s 
n-fold integral. He started with the differential equation 

$$\frac{d^ny}{dx^n} = f(x).$$


Repeated integration of this equation yielded 


5! Cauchy (1823) pp. 138-143. 
$$\frac{d^{n-1}y}{dx^{n-1}} = \int_{x_0}^{x} f(z)\,dz + C, \qquad \frac{d^{n-2}y}{dx^{n-2}} = \int_{x_0}^{x}(x - z)f(z)\,dz + C(x - x_0) + C_1, \quad \ldots,$$

$$y = \int_{x_0}^{x}\frac{(x - z)^{n-1}}{1\cdot 2\cdot 3\cdots(n-1)}f(z)\,dz + \frac{C(x - x_0)^{n-1}}{1\cdot 2\cdot 3\cdots(n-1)} + \cdots + C_{n-2}(x - x_0) + C_{n-1},$$

where $C, C_1, \ldots, C_{n-1}$ were arbitrary constants. Here Cauchy used the result 
expressed in equation (11.8) to integrate in each step of the argument. The reader 
might compare this with Newton’s formula (11.5). Cauchy then proceeded to obtain 
Taylor’s theorem with remainder. He let y = F(x) be a specific solution of 
$y^{(n)} = f(x)$ to obtain 

$$C_{n-1} = F(x_0), \quad C_{n-2} = F'(x_0), \quad \ldots, \quad C = F^{(n-1)}(x_0),$$

and with these values, he had 

$$F(x) = F(x_0) + \frac{F'(x_0)}{1}(x - x_0) + \cdots + \frac{F^{(n-1)}(x_0)}{1\cdot 2\cdots(n-1)}(x - x_0)^{n-1} + \int_{x_0}^{x}\frac{(x - z)^{n-1}}{1\cdot 2\cdots(n-1)}F^{(n)}(z)\,dz.$$


Cauchy gave another proof,⁵² that ran along the lines of Lagrange’s second proof. 
He started with the lemma: Suppose f(x) and F(x) are continuously differentiable 
in $[x_0, x]$ with $f(x_0) = F(x_0) = 0$, and F'(x) > 0 in this interval. For x in this 
interval, if 

$$A \leq \frac{f'(x)}{F'(x)} \leq B,$$

then 

$$A \leq \frac{f(x)}{F(x)} \leq B.$$

To prove this, Cauchy noted that since F'(x) > 0, he had 

$$f'(x) - AF'(x) \geq 0 \quad\text{and}\quad f'(x) - BF'(x) \leq 0.$$

He then applied Lagrange’s lemma to the functions 

$$f(x) - AF(x) \quad\text{and}\quad f(x) - BF(x)$$

to obtain the required result. Cauchy then took $x = x_0 + h$ and applied the intermediate 
value theorem to derive 


52 ibid. pp. 161-168. 
$$\frac{f(x_0 + h)}{F(x_0 + h)} = \frac{f'(x_0 + \theta h)}{F'(x_0 + \theta h)}, \quad\text{where } 0 < \theta < 1.$$

In the situation where $f(x_0)$ and $F(x_0)$ were nonzero, he replaced $f(x_0 + h)$ and 
$F(x_0 + h)$ by $f(x_0 + h) - f(x_0)$ and $F(x_0 + h) - F(x_0)$, respectively, to get the 
generalized mean value theorem: 

$$\frac{f(x_0 + h) - f(x_0)}{F(x_0 + h) - F(x_0)} = \frac{f'(x_0 + \theta h)}{F'(x_0 + \theta h)}, \quad\text{where } 0 < \theta < 1. \tag{11.12}$$

He next supposed $f'(x_0) = f''(x_0) = \cdots = f^{(n-1)}(x_0) = 0 = F'(x_0) = F''(x_0) = 
\cdots = F^{(n-1)}(x_0)$ and $F^{(n)} \neq 0$, and that all the derivatives were continuous. By 
an iteration of the process used to find the generalized mean value theorem, Cauchy 
deduced that 

$$\frac{f(x_0 + h)}{F(x_0 + h)} = \frac{f^{(n)}(x_0 + \theta h)}{F^{(n)}(x_0 + \theta h)}, \quad\text{where } 0 < \theta < 1. \tag{11.13}$$

He then let h → 0 to deduce l’Hôpital’s rule 

$$\lim_{x \to x_0}\frac{f(x)}{F(x)} = \lim_{x \to x_0}\frac{f^{(n)}(x)}{F^{(n)}(x)}.$$

From this result, Cauchy derived Taylor’s formula with Lagrange’s remainder by 
taking $F(x) = (x - x_0)^n$ and replacing f(x) by 

$$g(x) = f(x) - f(x_0) - f'(x_0)(x - x_0) - \cdots - \frac{f^{(n-1)}(x_0)}{(n-1)!}(x - x_0)^{n-1};$$

this vanished at $x_0$, along with its first n − 1 derivatives. Then by (11.13), 

$$g(x_0 + h) = \frac{h^n}{n!}g^{(n)}(x_0 + \theta h), \quad 0 < \theta < 1.$$

Since $g^{(n)}(x) = f^{(n)}(x)$, the required result followed. 


Cauchy also obtained another form of the remainder⁵³ by defining a function $\phi(a)$ 
by the equation 

$$f(x) = f(a) + \frac{x - a}{1}f'(a) + \frac{(x - a)^2}{1\cdot 2}f''(a) + \cdots + \frac{(x - a)^{n-1}}{1\cdot 2\cdots(n-1)}f^{(n-1)}(a) + \phi(a).$$

In (11.12), taking F(x) = x, $x_0 = a$, h = x − a, and f = φ, he had 

$$\phi(a) = \phi(x) + (a - x)\phi'(a + \theta(x - a)), \quad 0 < \theta < 1.$$


53 ibid. pp. 147-148. Cauchy (1829) pp. 69-79, especially pp. 77 and 87. 
Since 

$$\phi(x) = 0 \quad\text{and}\quad \phi'(a) = -\frac{(x - a)^{n-1}}{1\cdot 2\cdots(n-1)}f^{(n)}(a),$$

he concluded 

$$\phi(a) = \frac{(1 - \theta)^{n-1}(x - a)^n}{1\cdot 2\cdots(n-1)}f^{(n)}(a + \theta(x - a)). \tag{11.14}$$


This remainder, called Cauchy’s remainder, can also be obtained from the integral 
form of the remainder. 
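
One way to see this, sketched here under the added assumption that $f^{(n)}$ is continuous between a and x, is to apply the mean value theorem for integrals to the integral form of the remainder and then rename the intermediate point; the symbol ξ is introduced only for this sketch.

```latex
\begin{aligned}
\frac{1}{(n-1)!}\int_a^x (x-u)^{n-1} f^{(n)}(u)\,du
  &= \frac{x-a}{(n-1)!}\,(x-\xi)^{n-1} f^{(n)}(\xi)
     && \text{for some } \xi \text{ between } a \text{ and } x,\\
  &= \frac{(1-\theta)^{n-1}(x-a)^{n}}{(n-1)!}\, f^{(n)}\bigl(a+\theta(x-a)\bigr)
     && \text{with } \xi = a+\theta(x-a),\ 0<\theta<1,
\end{aligned}
```

since $x - \xi = (1 - \theta)(x - a)$. The last expression agrees with (11.14).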


11.10 Cauchy: The Intermediate Value Theorem 


Recall that the intermediate value theorem was regarded as intuitively or geometrically 
obvious by eighteenth-century mathematicians. For example, Lagrange and Laplace 
assumed it in their derivations of the remainder. Bolzano and Cauchy saw the need for 
a proof and each provided one. Cauchy stated and proved the theorem in his lectures,⁵⁴ 
published in 1821: Suppose f(x) is a real function of x, continuous between $x_0$ and X. 
If $f(x_0)$ and f(X) have opposite signs, then the equation f(x) = 0 is satisfied by at 
least one value between $x_0$ and X. In his proof, Cauchy first divided $[x_0, X]$ of length 
$h = X - x_0$ into m parts to consider the sequence $f(x_0), f\left(x_0 + \frac{h}{m}\right), f\left(x_0 + \frac{2h}{m}\right), \ldots, 
f\left(X - \frac{h}{m}\right), f(X)$. Since $f(x_0)$ and f(X) had opposite signs, he had two consecutive 
terms, say, $f(x_1)$ and f(X') with opposite signs. Clearly 

$$x_0 \leq x_1 < X' \leq X \quad\text{and}\quad X' - x_1 = \frac{h}{m} = \frac{X - x_0}{m}.$$

We remark that Cauchy’s notation was slightly different in that he used < for ≤. 
He repeated the preceding process for the interval $[x_1, X']$ to get $x_1 \leq x_2 < X'' \leq X'$ 
with $X'' - x_2 = \frac{X - x_0}{m^2}$. Continuation of this procedure produced two sequences, 
$x_0 \leq x_1 \leq x_2 \leq \cdots$ and $X \geq X' \geq X'' \geq \cdots$, such that the differences between 
corresponding members of the two sequences became arbitrarily small. Thus, he had 
the two sequences converging to a common limit a. Now since f was continuous 
between $x = x_0$ and x = X, the two sequences $f(x_0), f(x_1), f(x_2), \ldots$ and 
$f(X), f(X'), f(X''), \ldots$ converged to f(a). Since the signs of the numbers in the 
first sequence were opposite to the signs of the numbers in the second sequence, it 
followed by continuity that f(a) = 0. Since $x_0 \leq a \leq X$, Cauchy had the required 
result. Observe that Cauchy assumed that a sequence, now called a Cauchy sequence, 
must converge; this was later proved by Dedekind. Bolzano’s slightly earlier proof of 
the intermediate value theorem had a similar deficiency, as he himself recognized in 
the 1830s. 
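
Cauchy’s subdivision argument is also a workable numerical procedure. The following is a minimal sketch in Python, not a transcription of Cauchy’s text: the function name, the default m = 10, and the stopping tolerance are assumptions made for illustration, and floating-point arithmetic stands in for the real numbers whose completeness the proof takes for granted.

```python
def cauchy_root(f, x0, X, m=10, tol=1e-12):
    """Locate a zero of f on [x0, X] by repeatedly dividing the interval into m parts."""
    if f(x0) * f(X) > 0:
        raise ValueError("f(x0) and f(X) must not have the same sign")
    while X - x0 > tol:
        step = (X - x0) / m
        points = [x0 + k * step for k in range(m)] + [X]
        # Keep the first pair of consecutive points where f changes sign (or vanishes).
        for left, right in zip(points, points[1:]):
            if f(left) * f(right) <= 0:
                x0, X = left, right
                break
    return (x0 + X) / 2


root = cauchy_root(lambda x: x * x - 2, 1.0, 2.0)
print(root)   # an approximation to the square root of 2
```

Each pass shrinks the bracketing interval by the factor m, mirroring the relation X' − x₁ = (X − x₀)/m in the proof; with m = 2 the procedure reduces to ordinary bisection.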


54 Cauchy (1989) Note V. 
11.11 Exercises 


(1) Following Johann Bernoulli, consider the differential equation $dy = \sqrt{1 - y^2}\,dx$ 
for y = sin x, and take $n = \sqrt{1 - y^2}$, dz = dx in Bernoulli’s formula (11.6) to obtain 

$$\frac{y}{\sqrt{1 - y^2}} = \frac{x - \frac{x^3}{2\cdot 3} + \frac{x^5}{2\cdot 3\cdot 4\cdot 5} - \cdots}{1 - \frac{x^2}{2} + \frac{x^4}{2\cdot 3\cdot 4} - \cdots}.$$

Next consider the equation $dy = \frac{a\,dx}{a + x}$ and apply Bernoulli’s method to obtain 
his series for $a\ln\left(\frac{a + x}{a}\right)$: 

$$\frac{ax}{a + x} + \frac{ax^2}{2(a + x)^2} + \frac{ax^3}{3(a + x)^3} + \frac{ax^4}{4(a + x)^4} + \cdots.$$

See Joh. Bernoulli (1968) vol. I, pp. 127-128. This paper was published in the 
Acta Eruditorum in 1694. 

(2) Complete de Moivre’s outline of a method to obtain Bernoulli’s series for 
$\int y\,dz$. Note that this is similar to Newton’s method of successive approximation. 
Let the fluent of $\dot{z}y$ be $zy - q$ so that $\dot{z}y = \dot{z}y + z\dot{y} - \dot{q}$ or $\dot{q} = z\dot{y}$. Now 
let $\dot{y} = \dot{z}v$ so that $\dot{q} = z\dot{z}v$. Take the fluent of each side to get $q = \frac{1}{2}zzv - r$ for 
some r. Then $z\dot{z}v = z\dot{z}v + \frac{1}{2}zz\dot{v} - \dot{r}$, so that $\dot{r} = \frac{1}{2}zz\dot{v}$. Set $\dot{v} = \dot{z}s$ and continue 
as before. De Moivre gave this argument in 1704. See Feigenbaum (1985) 
p. 93. 


(3) Show that Bernoulli’s series (11.6) is obtained by applying integration by parts 
to $\int n\,dz$ and then repeating the process infinitely often. 

(4) In the Bernoulli series for $\int n\,dz$, set n = f'(z) to obtain 

$$f(z) - f(0) = zf'(z) - \frac{z^2}{2!}f''(z) + \frac{z^3}{3!}f'''(z) - \cdots. \tag{11.15}$$

Similarly, find the series for f'(z) − f'(0), f''(z) − f''(0), f'''(z) − f'''(0), ... 
and use them to eliminate f'(z), f''(z), f'''(z), ... from the right-hand side of 
(11.15). Show that the result is the Maclaurin series for f(z). See Whiteside’s 
footnote in Newton (1967-1981) vol. VII, p. 19. 

(5) Show that all the derivatives of $f(x) = e^{-1/x^2}$ (x ≠ 0), 0 (x = 0) are zero at 
x = 0. Cauchy remarked that the two functions $e^{-x^2}$ and $e^{-x^2} + e^{-1/x^2}$ had the 
same Maclaurin series. See Cauchy (1823) p. 230 and Cauchy (1829) p. 105. 

(6) Show that the remainder in Taylor’s theorem can be expressed in the form 

$$\frac{h^n(1 - \theta)^{n-p}}{(n-1)!\,p}f^{(n)}(x + \theta h), \quad 0 < \theta < 1,$$

due to Schlömilch and also published a decade later by E. Roche. See 
Prasad (1931) p. 90, Hobson (1957a) vol. 2, p. 200, and Schlömilch (1847) 
p. 179. 
(7) Prove that if f, φ, and F are differentiable, then 

$$\begin{vmatrix}
f(x + h) & \phi(x + h) & F(x + h)\\
f(x) & \phi(x) & F(x)\\
f'(x + \theta h) & \phi'(x + \theta h) & F'(x + \theta h)
\end{vmatrix} = 0,$$

for some 0 < θ < 1. This result and its generalization to n + 1 functions is 
stated in Giuseppe Peano’s Calcolo differenziale of 1884. 

(8) Let h > 0. Set $m(x_1, x_2) = \frac{f(x_1) - f(x_2)}{x_1 - x_2}$. Define the four derivatives of f: 

$$f^{+}(x) = \varlimsup_{h \to 0} m(x + h, x), \qquad f_{+}(x) = \varliminf_{h \to 0} m(x + h, x),$$

$$f^{-}(x) = \varlimsup_{h \to 0} m(x - h, x), \qquad f_{-}(x) = \varliminf_{h \to 0} m(x - h, x).$$

Show that if f(x) is continuous on [a, b], then there is a point x in (a, b) 
such that either 

$$f^{+}(x) \leq m(a, b) \leq f_{-}(x) \quad\text{or}\quad f_{+}(x) \geq m(a, b) \geq f^{-}(x).$$

The generalized mean value theorem is a corollary: If there is no distinction 
with respect to left and right with regard to the derivatives of f(x), then there 
is a point x in (a, b) at which f has a derivative and its value is equal to m(a, b). 
See Young and Young (1909). 


11.12 Notes on the Literature 


Malet (1993) explains that Gregory could have obtained the Taylor rule without being 
in possession of a differential or equivalent technique. Feigenbaum (1985) presents a 
thorough discussion of Taylor’s book, Methodus Incrementorum, as well as a treatment 
of the work of earlier mathematicians who contributed to the Taylor series. See also 
Feigenbaum (1981), containing an English translation of the Methodus. For later work 
on the Taylor series, especially the remainder term, see Pringsheim (1900). 

Grabiner (1981) and (1990) are interesting sources for topics related to the work 
on series of Lagrange and Cauchy. Grabiner shows that, although Cauchy did not 
accept Lagrange’s ideas on the foundations of calculus, Lagrange’s use of algebraic 
inequalities nevertheless exerted a significant influence on Cauchy. She further points 
out that, a half century earlier, Maclaurin made brilliant use of inequalities to prove 
theorems in calculus. See her article, “Was Newton’s Calculus a Dead End?” in 
Van Brummelen and Kinyon (2005). A look at Cauchy’s 1820s lectures on calculus 
from a modern viewpoint is in Bressoud (2007). 
Integration of Rational Functions


12.1 Preliminary Remarks 


The integrals of rational functions form the simplest class of integrals; they are 
included in a first course in calculus. Yet some problems associated with the 
integration of rational functions have connections with the deeper aspects of algebra
and of analysis. Examples are the factorization of polynomials and the evaluation 
of beta integrals. These problems have challenged mathematical minds as great as 
Newton, Johann Bernoulli, de Moivre, Euler, Gauss, and Hermite; indeed, they have 
their puzzles for us even today. For example, can a rational function be integrated 
without factorizing the denominator of the function? 

Newton was the first mathematician to explicitly define and systematically attack 
the problem of integrating rational and algebraic functions. Of course, mathematicians 
before Newton had integrated some specific rational functions, necessary for their 
work. The Kerala mathematicians found the series for arctangent; N. Mercator¹ and
Hudde worked out the series for the logarithm. In 1676, Leibniz met Hudde and one of
Leibniz’s short manuscripts contains notes, apparently made soon after that meeting,
describing Hudde’s mathematical work. The first few lines of these notes deal with the
logarithm:²


Hudde showed me that in the year 1662 he already had the quadrature of the hyperbola, which 
I found was the very same as Mercator also had discovered independently, and published. He 
showed me a letter written to a certain van Duck, of Leyden I think, on this subject. 


Newton’s work was made possible by his discovery, sometime in mid-1665, of the 
inverse relation between the derivative and the integral. At that time, he constructed 
tables, extending to some pages, of functions that could be integrated because they 
were derivatives of functions already explicitly or implicitly defined. He extended his 
tables by means of substitution or, equivalently, by use of the chain rule for derivatives. 
He further developed the tables by an application of the product rule for derivatives, or 


1 Mercator (1668).
2 See Leibniz (1920) p. 123. 
integration by parts, in his October 1666 tract on fluxions.³ In this work, Newton
viewed a curve dynamically: The variation of its coordinates x and y could be viewed
as the motion of two bodies with velocities p and q, respectively. He posed the
problem of determining y when $\frac{q}{p}$ was known and noted: “Could this ever bee done
all problems whatever might bee resolved. But by $y^e$ following rules it may be very
often done.”⁴

After giving the already known rules for integrating $ax^{m/n}$ when $\frac{m}{n} \neq -1$ and when
$\frac{m}{n} = -1$, Newton went on to consider examples, such as the integrals of — a
and ae" He did not take more complicated rational functions, perhaps because of 
a lack of an understanding of partial fractions. Instead, he evaluated integrals of some 


algebraic functions involving square roots. One result stated:⁵


3n 


a ON, ad qn 7 F 
If opm =p Make a tox? = 2.9 [then] is 


cx" 3ac /zz-—a 
a N 4 px2n =y, 12.1 
pe InbbV b 4 el) 


The square symbol was the equivalent of our integral with respect to z, representing 
area. 

Newton was led deeper into the integration of rational functions by a letter from
Leibniz dated August 17, 1676, addressed to Oldenburg, but intended for all British
mathematicians.⁶ In this letter, Leibniz presented his series for $\pi$,

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots. \tag{12.2}$$


To obtain this series, Leibniz applied transmutation, a somewhat ad hoc method of
finding the area of a figure by transforming it into another figure with the same area.⁷
Grégoire St. Vincent, Pascal, Gregory, and others had employed this method before
Leibniz.⁸ In his reply to Leibniz, of October 1676,⁹ Newton listed an infinitely infinite
family of rational and algebraic functions, saying that he could integrate them. These
included the four rational functions

$$\frac{d\,z^{n-1}}{e + fz^{n} + gz^{2n}}, \qquad \frac{d\,z^{2n-1}}{e + fz^{n} + gz^{2n}}, \ \ldots,$$

$$\frac{d\,z^{\frac{1}{2}n-1}}{e + fz^{n} + gz^{2n}}, \qquad \frac{d\,z^{\frac{3}{2}n-1}}{e + fz^{n} + gz^{2n}}, \ \ldots,$$

where d, e, f, and g were constants and in the third and fourth expressions, in case
e and g had the same sign, $4eg$ had to be $< f^2$. Newton went on to observe that the


3 Newton (1967-1981) vol. 1, pp. 400-448.
4 ibid. p. 403.
5 ibid. p. 409.
6 Newton (1959-1960) vol. 2, pp. 57-75.
7 Leibniz (1920) pp. 42-47. See also Leibniz and Knobloch (1993) pp. 78-80.
8 See Hofmann (1974) chapters 5-7.
9 Newton (1959-1960) vol. 2, pp. 110-161.
expressions could become complicated,¹⁰ “so that I hardly think they can be found by
the transformation of the figures, which Gregory and others have used, without some
further foundation. Indeed I myself could gain nothing at all general in this subject
before I withdrew from the contemplation of figures and reduced the whole matter to
the simple consideration of ordinates alone.”

Newton then observed that Leibniz’s series would be obtained by taking $n = 1$ and
$f = 0$, and implicitly $e = g = 1$, in the first function. In fact,

$$\frac{\pi}{4} = \arctan 1 = \int_0^1 \frac{dz}{1+z^2} = \int_0^1 \left(1 - z^2 + z^4 - \cdots\right) dz = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.$$


As another series, he offered

$$\frac{\pi}{2\sqrt{2}} = 1 + \frac{1}{3} - \frac{1}{5} - \frac{1}{7} + \frac{1}{9} + \frac{1}{11} - \cdots \tag{12.3}$$

and explained that it could be obtained by means of a long calculation, setting
$2eg = f^2$ and $n = 1$.¹¹ Again taking $e = g = 1$, this leads to the expression
$\frac{1}{1 + \sqrt{2}\,z + z^2}$. He did not clarify any further and apparently Leibniz did not understand
him, since even a quarter century later Leibniz had trouble with the integral arising in
this situation. See Exercise 1 of this chapter.
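As a numerical sanity check, which is no part of Leibniz's or Newton's reasoning, the partial sums of the two series can be compared with $\frac{\pi}{4}$ and $\frac{\pi}{2\sqrt{2}}$. The sketch below is only illustrative; the function names are hypothetical, and the slow convergence (error roughly proportional to the reciprocal of the number of terms) is the reason so many terms are used.

```python
import math


def leibniz_partial_sum(terms):
    """Partial sum of 1 - 1/3 + 1/5 - 1/7 + ... (signs alternate one at a time)."""
    return sum((-1) ** k / (2 * k + 1) for k in range(terms))


def newton_partial_sum(terms):
    """Partial sum of 1 + 1/3 - 1/5 - 1/7 + 1/9 + 1/11 - ... (signs alternate in pairs)."""
    total = 0.0
    for k in range(terms):
        sign = 1.0 if k % 4 < 2 else -1.0  # pattern + + - - + + - - ...
        total += sign / (2 * k + 1)
    return total


if __name__ == "__main__":
    n_terms = 200_000
    print(leibniz_partial_sum(n_terms), math.pi / 4)
    print(newton_partial_sum(n_terms), math.pi / (2 * math.sqrt(2)))
```

Both printed pairs should agree to several decimal places, which is consistent with the values assigned to the two series in the text.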

Some of Newton’s unpublished notes from this period suggest that he considered
integrals of the form $\int \frac{dx}{1 \pm x^m}$, since these integrals would lead to simple and interesting
series. To express them in terms of standard integrals (that is, in terms of elementary
functions such as the logarithm and arctangent), Newton had to consider the problem
of factorizing $1 \pm x^m$ so that he could resolve the integrals into simpler ones by the
use of partial fractions. We note that, in several examples in his 1670-71 treatise on
fluxions and infinite series, he had broken up rational fractions into a sum of two
fractions.¹² In the 1710s, Cotes and Johann Bernoulli and, to a lesser extent, Leibniz
pursued the algebraic topic of partial fractions with more intensity than did Newton.
It may be noted in this context that even in 1825 Jacobi was able to make an original
contribution to partial fractions in his doctoral dissertation.¹³

Newton’s method for finding the quadratic factors of $1 \pm x^m$ was to start with

$$(1 + nx + x^2)(1 - nx + px^2 - qx^3 + rx^4 - \cdots) = 1 \pm x^m \tag{12.4}$$


and then determine the pattern of the algebraic equations satisfied by n for different
values of m. In this way he factored $1 \pm x^3$, $1 + x^4$, $1 - x^5$, $1 + x^6$, $1 + x^8$, $1 + x^{12}$,
though he apparently was unable to resolve the equation for n when m = 10. As an
example, Newton found the equation for $1 \pm x^4$ to be $n^3 - 2n = 0$, or $n = \pm\sqrt{2}$ and
$n = 0$.¹⁴ This would yield


10 ibid. p. 138. 

11 ibid.

12 Newton (1967-1981) vol. 3, p. 246. 
13 Jacobi (1969) vol. 3, pp. 1-44. 

14 Newton (1967-1981) vol. IV, p. 207. 
$$x^4 + 1 = (x^2 + \sqrt{2}\,x + 1)(x^2 - \sqrt{2}\,x + 1), \tag{12.5}$$

and, of course

$$x^4 - 1 = (x^2 - 1)(x^2 + 1),$$

though Newton did not bother to write this last explicitly. Note that this factorization
of $x^4 + 1$ was just what he needed to derive his series for $\frac{\pi}{2\sqrt{2}}$ in (12.3). He also
recognized that values of n were related to cosines of appropriate angles.¹⁵ He was
just a step away from Cotes’s factorization of $x^n \pm a^n$.

Newton also considered the binomial $1 \pm x^7$ and found the equation for n to be
$n^6 - 5n^4 + 6nn - 1 = 0$. Note that the solution involved cube roots; Newton did not
write the values of $n^2$, apparently because he wanted to consider only those values
expressible, at worst, by quadratic surds. One wonders whether it occurred to him to
ask which values of m would lead to equations in n solvable by quadratic radicals. In
1796, Gauss resolved this problem in his theory of constructible regular polygons.¹⁶

In 1702, since Newton’s work remained unpublished, Johann Bernoulli and
Leibniz in separate publications discussed the problem of factorizing polynomials, in
connection with the integration of rational functions. In general, Leibniz and Bernoulli
were of the opinion that integration of rational functions could be carried out by partial
fractions, but the devil lay in the details. In his paper, “Specimen novum analyseos pro
scientia infiniti circa summas et quadraturas,” Leibniz factored¹⁷

$$x^4 + a^4 = \left(x + a\sqrt{\sqrt{-1}}\right)\left(x - a\sqrt{\sqrt{-1}}\right)\left(x + a\sqrt{-\sqrt{-1}}\right)\left(x - a\sqrt{-\sqrt{-1}}\right). \tag{12.6}$$

He was puzzled by this factorization and wondered whether the integrals $\int \frac{dx}{x^4 + a^4}$
and $\int \frac{dx}{x^8 + a^8}$ could be expressed in terms of logarithms and inverse trigonometric
functions. Bernoulli’s paper, “Solution d’un problème concernant le calcul intégral”¹⁸
also observed that the arctangent was related to the logarithm of imaginary values
because

$$\int \frac{dx}{a^2 + x^2} = \frac{1}{2a}\left(\int \frac{dx}{a + ix} + \int \frac{dx}{a - ix}\right). \tag{12.7}$$

Cotes made the connection between the logarithm and the trigonometric functions
even more explicit with his discovery of the formula

$$\log(\cos\theta + i\sin\theta) = i\theta. \tag{12.8}$$
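A short modern computation, which is not Bernoulli's own argument, may make the link between (12.7) and (12.8) more concrete. It assumes $a > 0$, real $x$, and the principal branch of the complex logarithm:

```latex
\begin{align*}
\int_0^x \frac{dt}{a^2+t^2}
  &= \frac{1}{2a}\left(\int_0^x \frac{dt}{a+it} + \int_0^x \frac{dt}{a-it}\right)
   = \frac{1}{2ai}\bigl(\ln(a+ix) - \ln(a-ix)\bigr) \\
  &= \frac{1}{2ai}\,\ln\frac{a+ix}{a-ix}
   = \frac{1}{a}\arctan\frac{x}{a}.
\end{align*}
```

The last step uses $\frac{a+ix}{a-ix} = \cos 2\phi + i\sin 2\phi$ with $\phi = \arctan\frac{x}{a}$, so that the logarithm equals $2i\phi$, in agreement with (12.8).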


Roger Cotes (1679-1716) is known for his factorization theorem, his work on 
approximate quadrature, and for editing the 1713 edition of the Principia. He studied 


15 ibid. p. 208.
16 Gauss (1966) pp. 407-460, especially pp. 457-460.
17 Leibniz (1971) vol. 5, pp. 350-361, especially p. 360.
18 Bernoulli (1968) vol. 1, pp. 393-400.

Figure 12.1 Cotes’s factorization of $x^n - a^n$ as a property of the circle.


at Cambridge and became Fellow of Trinity College in 1704 and Plumian Professor 
of Astronomy and Experimental Philosophy in 1705. Unfortunately, he published 
only one paper in his lifetime, on topics related to the logarithmic function. Robert 
Smith published Cotes’s mathematical writings in a 1722 work titled Harmonia 
Mensurarum. Formula (12.8) was stated geometrically:¹⁹


For if some arc of a quadrant of a circle described with radius CE has sine CX, and sine of the
complement of the quadrant XE, taking radius CE as modulus, the arc will be the measure of the
ratio between $EX + XC\sqrt{-1}$ and CE, the measure having been multiplied by $\sqrt{-1}$.


Observe that this statement translates to $iR \ln(\cos\theta + i\sin\theta) = R\theta$, which is not
quite correct, because the i should be on the other side of the equation.

Cotes’s factorization theorem was stated as a property of the circle. In Figure 12.1,
the circumference of a circle of radius a with center O is divided into n equal parts.
A point P lying on the line $OA_1$ and inside the circle is joined to each division point
$A_1, A_2, A_3, \ldots, A_n$. Then, with $x = OP$,

$$PA_1 \cdot PA_2 \cdots PA_n = a^n - x^n. \tag{12.9}$$

Cotes noted that if P lay outside the circle, the product equaled $x^n - a^n$, and he had a
similar result for the factorization of $x^n + a^n$.

Cotes wrote to William Jones on May 5, 1716,²⁰ that he had resolved by a general
method the questions raised by Leibniz in his 1702 paper on the integration of rational
functions. Unfortunately, Cotes died two months later, but Smith searched among his
papers and unearthed the new method. Smith’s note in his copy of the Harmonia
stated:²¹ “Sir Isaac Newton, speaking of Mr. Cotes said ‘if he had lived we might
have known something.’”

It is very likely that the source of Cotes’s inspiration was Bernoulli’s paper on the 
integration of rational functions, pointing out the connection between the logarithm 


19 Gowing (1983) p. 50.
20 Rigaud (1841) vol. I, p. 271. 
21 Gowing (1983) p. 141. 
and the arctangent. In the second part of the Logometria on integration published
posthumously in 1722, Cotes wrote that the close connection between the measure 
of angles and measure of ratios (logarithms) had persuaded him to propose a single 
notation to designate the two measures. He used the symbol 


$$R\,\left|\,\frac{R+T}{S}\right. \tag{12.10}$$


to stand for $R \ln\frac{R+T}{S}$ when $R^2$ was positive; when $R^2$ was negative, it represented
$|R|\theta$, where $\theta$ was an angle such that the radius, tangent, and secant were in the ratio
$R : T : S$. We should keep in mind that for Cotes, tangent and secant stood for $R\tan\theta$
and $R\sec\theta$. In his tables, he gave the single value


2 |R+T 
a 


e S 


for the fluent of $\frac{\dot{x}}{e + fx^2}$, that is, for $\int \frac{dx}{e + fx^2}$, when $R = \sqrt{-\frac{e}{f}}$, $T = x$, and
$S = \sqrt{x^2 + \frac{e}{f}}$. Recall that when $f < 0$, the integral is a logarithm and when $f > 0$,
the integral is an arctangent. Cotes’s notation distinguished the two cases by the 
interpretation of the symbols depending on R. This notation implies that when R 
is replaced by iR in the logarithm, we get the angular measure provided that S 
and T are replaced by Rsec@ and Rtan@. Thus, we have, where C is the constant 
of integration, 

$$iR \ln\frac{iR + R\tan\theta}{R\sec\theta} = R\theta + C.$$

When we take $C = -R\frac{\pi}{2}$ and $\theta - \frac{\pi}{2} = \phi$, this yields

$$\ln(\cos\phi + i\sin\phi) = i\phi.$$


It may be of interest to note that, as de Moivre and Euler showed, this result connecting 
logarithms with angles also served as the basis for Cotes’s factorization formula. 
Surprisingly, Johann Bernoulli did not make any use of his discovery of the connection 
between the logarithm and the arctangent (12.7). British mathematicians such as 
Newton and Cotes were ahead of the Continental European mathematicians in the 
matter of integration of rational functions, but by 1720 the Continental mathematicians 
had caught up. In 1718, Brook Taylor challenged them to integrate rational functions
of the form

$$\frac{x^{m-1}}{e + fx^{m} + gx^{2m}}.$$


Johann Bernoulli and Jakob Hermann, a former student of Jakob Bernoulli,
responded with solutions.²² In particular, they explained how the denominator could


22 Bernoulli (1968) vol. 2, pp. 402-419 and Herman (1719).
be factored into two trinomials of the form $a + bx^{m/2} + cx^{m}$. And in 1720, Niklaus I


23 
Bernoulli showed how to deal with the integral of ze i = a faa 


We can describe the Newton–Cotes–Bernoulli–Leibniz algorithm for integrating a
rational function f(x) with real coefficients by writing

$$f(x) = P(x) + \frac{N(x)}{D(x)},$$

where P, N, D are polynomials with degree N < degree D and where N and D have
1 as their greatest common divisor. Factorize D(x) into linear and quadratic factors so
that their coefficients are real:

$$D(x) = c \prod_{i=1}^{n} (x - a_i)^{e_i} \prod_{j=1}^{m} (x^2 + b_j x + c_j)^{f_j}. \tag{12.11}$$

Then there are real numbers $A_{ik}$, $B_{jk}$, and $C_{jk}$ such that

$$f(x) = P(x) + \sum_{i=1}^{n}\sum_{k=1}^{e_i} \frac{A_{ik}}{(x - a_i)^k} + \sum_{j=1}^{m}\sum_{k=1}^{f_j} \frac{B_{jk}x + C_{jk}}{(x^2 + b_j x + c_j)^k}. \tag{12.12}$$



From this it is evident that the result of the integration of f(x) contains an 
algebraic part, consisting of a rational function; and a transcendental part, consisting 
of arctangents and logarithms. 
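For instance, here is a worked example added for illustration; it is not taken from any of the historical sources. The factorization (12.5) gives the partial fractions decomposition and the integral of $\frac{1}{1+x^4}$:

```latex
\begin{align*}
\frac{1}{1+x^4}
  &= \frac{1}{2\sqrt{2}}\left(\frac{x+\sqrt{2}}{x^2+\sqrt{2}\,x+1}
     - \frac{x-\sqrt{2}}{x^2-\sqrt{2}\,x+1}\right), \\
\int \frac{dx}{1+x^4}
  &= \frac{1}{4\sqrt{2}}\ln\frac{x^2+\sqrt{2}\,x+1}{x^2-\sqrt{2}\,x+1}
     + \frac{1}{2\sqrt{2}}\Bigl(\arctan(\sqrt{2}\,x+1) + \arctan(\sqrt{2}\,x-1)\Bigr) + C.
\end{align*}
```

Here the denominator has no repeated factors, so the algebraic part is absent and only logarithms and arctangents appear. Combining the two fractions over the common denominator returns the numerator $2\sqrt{2}$, which confirms the first line.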

Though Leibniz and Bernoulli had in principle solved the problem of the integration 
of rational functions, the practical problem of computing the constants a,b,c and 
A,B,C was formidable. In 1744, Euler tackled this problem in two long papers, 
running to 150 pages of the Petersburg Academy Journal (or 125 pages of vol. 17 of 
Euler’s Opera Omnia). In these papers he explained in detail how to compute A, B,C 
in (12.12) when the roots of the denominator were known. He also worked out a large 
number of special integrals of the form 


x” 


where m and n were integers. By evaluating these integrals, Euler gained insight into 
several important topics. In fact, they provided him with new proofs of partial fractions 
expansions of trigonometric functions; evaluations of zeta and L-series values; the 
reflection formula for the gamma function; and infinite products for trigonometric 
functions. It is not surprising that Euler published several hundred pages on the 
integration of rational functions! 

The problem of factoring a polynomial is a difficult one, so the partial fractions 
method has its drawbacks. A question raised in the nineteenth century was whether a 
part or all of the integral of a rational function could be obtained without factorizing 
the denominator. The Russian mathematician Mikhail Vasilyevich Ostrogradsky 


23 Bernoulli (1968) vol. 2 pp. 419-422. 
published an algorithm in 1845 by which the rational part after integration could
be obtained without factorization.²⁴ In 1873, Charles Hermite published a different
algorithm and taught it in his courses at the École Polytechnique.²⁵ Ostrogradsky’s
algorithm was essentially rediscovered by E. Horowitz in his University of Wisconsin
doctoral thesis.²⁶

With the development of general computer algebra systems, the problem of 
mechanizing integration, including the integration of rational functions, has received 
new attention. The methods of Ostrogradsky and Hermite, along with others, have 
been important in the development of symbolic integration. The question of obtaining 
the logarithmic or arctangent portion of the integral of a rational function, without 
factorization of the denominator, has been resolved by a host of researchers. In these 
symbolic integration methods, the problem of factorization is replaced by the much 
more accessible problems of obtaining the greatest common divisors and/or resultants 
of polynomials. These last procedures in turn require polynomial division and the 
elimination of variables. Contributors to symbolic integration are many, including 
M. Bronstein, R. Risch, and M. F. Singer.²⁷
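The role played by greatest common divisors can be glimpsed in a small sketch: gcd(D, D′) collects the repeated factors of a denominator D without any factorization, and D / gcd(D, D′) is its squarefree part. What follows illustrates that single step only, under simple assumptions (exact rational coefficients, polynomials stored as coefficient lists with the constant term first, hypothetical helper names); it is not an implementation of the Ostrogradsky, Hermite, or Horowitz algorithms.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients; p[i] is the coefficient of x**i."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def derivative(p):
    """Formal derivative of the polynomial p."""
    return trim([i * c for i, c in enumerate(p)][1:])


def divmod_poly(a, b):
    """Quotient and remainder of a divided by b, over the rationals."""
    a = [Fraction(c) for c in trim(a)]
    b = [Fraction(c) for c in trim(b)]
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        factor = a[-1] / b[-1]
        q[shift] = factor
        for i, c in enumerate(b):
            a[i + shift] -= factor * c   # cancels the leading term of a
        a = trim(a)
    return trim(q), a


def gcd_poly(a, b):
    """Monic greatest common divisor by the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, divmod_poly(a, b)[1]
    lead = Fraction(a[-1])
    return [Fraction(c) / lead for c in a]


def squarefree_part(d):
    """D / gcd(D, D'): same roots as D, each with multiplicity one."""
    g = gcd_poly(d, derivative(d))
    quotient, remainder = divmod_poly(d, g)
    assert not remainder
    return quotient


if __name__ == "__main__":
    # D(x) = (x - 1)^2 (x^2 + 1) = x^4 - 2x^3 + 2x^2 - 2x + 1
    D = [1, -2, 2, -2, 1]
    print(gcd_poly(D, derivative(D)))   # repeated part, x - 1
    print(squarefree_part(D))           # (x - 1)(x^2 + 1)
```

Only polynomial division is used, so no root of D ever needs to be found.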


12.2 Newton’s 1666 Basic Integrals 


In the beginning sections of his October 1666 tract on calculus, Newton tackled the
problems of finding the areas under the curves $y = \frac{1}{c+x}$ and $y = \frac{1}{1+x^2}$; these are equivalent
to evaluating integrals of those functions. Recall, however, that seventeenth-century 
mathematicians thought in terms of curves, even those defined by equations, rather 
than functions. The variables in an equation were regarded as quantities or magnitudes 
on the same footing, rather than dependent and independent variables. Newton’s two 
integrals were the building blocks for the more general integrals of rational functions. 
It is interesting to read what he said about these integrals. He first noted the rule
that if

$$ax^{m/n} = \frac{q}{p}, \quad\text{then}\quad \frac{na}{m+n}\,x^{\frac{m+n}{n}} = y.$$

Note that in Newton’s 1690s notation $\frac{q}{p}$ was written as $\frac{\dot{y}}{\dot{x}}$, whereas Leibniz wrote
$\frac{dy}{dx}$. Newton next observed:²⁸

Soe [so] if $\frac{a}{x} = \frac{q}{p}$. Then is $\frac{a}{0}x^0 = y$. soe $y^t$ [that] y is infinite. But note $y^t$ in this case x &
y increase in $y^e$ [the] same proportion $y^t$ numbers & their logarithmes doe [do], y being like a
logarithme added to an infinite number $\frac{a}{0}$. [That is, $\int_0^x \frac{a}{t}\,dt = a\ln x - a\ln 0 = a\ln x + \frac{a}{0}$.] But if
x bee diminished by c, as if $\frac{a}{c+x} = \frac{q}{p}$, y is also diminished by $y^e$ infinite number $\frac{a}{0}c^0$ & becomes
finite like a logarithme of $y^e$ number x. & so x being given, y may bee mechanically found by a
Table of logarithmes, as shall be hereafter showne.


24 Ostrogradsky (1845).

25 Hermite (1905-1917) vol. 3, pp. 40-44. 
26 Horowitz (1969). 

27 See Bronstein (1997). 

28 Newton (1967-1981) vol. 1, p. 403. 
Figure 12.2 Newton’s integration of a rational function.


Here Newton was explaining how the logarithm could be obtained by an application
of the power rule, since he regarded this as the fundamental rule for integration.
Newton clearly saw the difficulty of division by zero when the rule was applied to
$\frac{a}{x}$. His method of dealing with this stumbling block can now be seen as an attempt
to define the logarithm as a limit. We may say that Newton was describing the
calculation:

$$\int_0^x \frac{a}{c+t}\,dt = \lim_{\epsilon\to 0}\int_0^x \frac{a}{(c+t)^{1-\epsilon}}\,dt = \lim_{\epsilon\to 0}\left(\frac{a}{\epsilon}\left((c+x)^{\epsilon} - c^{\epsilon}\right)\right) = a\ln\frac{c+x}{c}. \tag{12.14}$$
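The limit in (12.14) is easy to watch numerically. The following sketch, whose function names and sample values are purely illustrative, evaluates the power-rule expression for shrinking $\epsilon$ and compares it with $a\ln\frac{c+x}{c}$.

```python
import math


def power_rule_area(a, c, x, eps):
    """Integral of a / (c + t)**(1 - eps) over 0 <= t <= x, by the power rule."""
    return (a / eps) * ((c + x) ** eps - c ** eps)


def log_area(a, c, x):
    """The limiting value a * ln((c + x) / c)."""
    return a * math.log((c + x) / c)


if __name__ == "__main__":
    a, c, x = 2.0, 1.5, 3.0
    for eps in (1e-1, 1e-3, 1e-5):
        print(eps, power_rule_area(a, c, x, eps), log_area(a, c, x))
```

As $\epsilon$ decreases, the first printed value approaches the second, with a gap of order $\epsilon$.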


As for the integral of $\frac{1}{1+x^2}$, we would now give the result as arctan x. In Newton’s time,
the trigonometric quantities or functions were conceived of as line segments and their
ratios constructed in relation to arcs of circles. It was therefore natural for Newton to
connect the area under $y = \frac{1}{1+x^2}$ to the area of simpler or more well-known geometric
objects such as conic sections. For this reason, he reduced the integral to the area of a
sector of an ellipse.²⁹

Consider the diagram in Newton’s tract (Figure 12.2). Set $BD = v(x)$ and
$CB = z(x)$, where C is the point on the right-hand side. Let $z(t) = \frac{1}{\sqrt{1+t^2}}$ so that
$tz(t) = \sqrt{1 - z^2(t)} = \frac{v}{2}$. Thus, the curve VD is an arc of the ellipse $\frac{v^2}{4} + z^2 = 1$.
Note that $CV = z(0) = 1$. In a one-line argument, Newton showed that $\int_0^x \frac{dt}{1+t^2}$ was
equal to the area of sector CVD. To see this, observe that

$$\int_0^x \frac{1}{1+t^2}\,dt = \int_0^x z^2\,dt = xz^2(x) - \int_1^{z(x)} 2zt\,dz. \tag{12.15}$$

Since $2zt = v$, the rightmost integral represents the area under the ellipse from V
to B; when the negative sign is included with the integral, the area under the ellipse
from B to V is obtained. Moreover, $xz^2(x) = \frac{1}{2}v(x)z(x) = \frac{1}{2}BD \cdot CB$, and hence $xz^2(x)$
represents the area of the triangle DBC, completing the proof.

At this point, Newton may also have known that he could relate this area to an arc
of the circle. Since $2zt = 2\sqrt{1 - z^2}$, the integral on the right of (12.15) is twice the
area under the circle $y = \sqrt{1 - z^2}$ from z(x) to 1. Recall that Newton had already


29 ibid. p. 405. 
related this area to the arcsine almost two years earlier, when generalizing a result of
Wallis,³⁰ to obtain (8.14). Thus, he knew that

$$\int_{z(x)}^{1} 2zt\,dz = 2\int_{z(x)}^{1} \sqrt{1 - z^2}\,dz = 2\left(\frac{1}{2}z\sqrt{1 - z^2} + \frac{1}{2}\arcsin z\right)\Bigg|_{z(x)}^{1}. \tag{12.16}$$

When (12.15) is combined with (12.16), we get the integral in terms of the
arctangent:

$$\int_0^x \frac{1}{1+t^2}\,dt = \frac{\pi}{2} - \arcsin z(x) = \arccos\frac{1}{\sqrt{1+x^2}} = \arctan x. \tag{12.17}$$

12.3 Newton’s Factorization of $x^n \pm 1$


Most probably around 1676, Newton wrote his very sketchy notes³¹ on this factorization,
of which Whiteside has given a very helpful clarification. From these sources we
learn that Newton’s method of factoring $1 \pm x^m$ was to write

$$(1 + nx + x^2)(1 - nx + px^2 - qx^3 + rx^4 - \cdots) = 1 \pm x^m \tag{12.18}$$

and equate the coefficients of $x^i$ for $2 \le i \le m - 1$ to 0 in order to obtain
equations satisfied by $n, p, q, r, \ldots$. By eliminating $p, q, r, \ldots$, he obtained the
algebraic equation satisfied by n. For example, when m = 4, Newton’s equations
were $p + 1 - n^2 = 0$ and $pn - n = 0$. Note that the first equation multiplied by n
gives $pn + n - n^3 = 0$, and hence, by the second equation, $n^3 - 2n = 0$. One can then
write $n = 0, \pm\sqrt{2}$. The first case gives the factorization

$$1 - x^4 = (1 - x^2)(1 + x^2)$$

and the second gives

$$1 + x^4 = (1 - \sqrt{2}\,x + x^2)(1 + \sqrt{2}\,x + x^2). \tag{12.19}$$


Recall that Newton applied this factorization to derive his series for $\frac{\pi}{2\sqrt{2}}$. His cryptic
remark on his method was apparently insufficient for Leibniz to decipher, so in 1702
Leibniz could obtain only (12.6), leading him to wonder if $\int \frac{dx}{x^4 + a^4}$ could be expressed
in terms of logarithms and arctangents. One may get a sense of the ill will existing at
that time between the supporters of Newton and those of Leibniz from a remark in a
1716 letter from Roger Cotes to William Jones:³²


30 Newton (1967-1981) vol. I, p. 108 
31 Newton (1967-1981) vol. 4, pp. 205-213. 
32 Rigaud (1841) vol. I, p. 271.
M. Leibnitz, in the Leipsic Acts of 1702 p. 218 and 219, has very rashly undertaken to demonstrate
that the fluent of $\frac{\dot{x}}{x^4 + a^4}$ cannot be expressed by measures of ratios and angles; and he swaggers
upon the occasion (according to his usual vanity), as having by this demonstration determined a 
question of the greatest moment. 


Using the same method as before, m = 5 gave Newton the equation $n^4 - 3n^2 + 1 = 0$,
or $n^2 = \frac{3 \pm \sqrt{5}}{2}$, or $n = \frac{1 \pm \sqrt{5}}{2}$. Thus, he got the factorization

$$1 - x^5 = \left(1 + \frac{1+\sqrt{5}}{2}\,x + x^2\right)\left(1 - \frac{1+\sqrt{5}}{2}\,x + \frac{1+\sqrt{5}}{2}\,x^2 - x^3\right).$$

Of course, the second factor could be further factorized as $(1 - x)\left(1 + \frac{1-\sqrt{5}}{2}\,x + x^2\right)$,
though Newton did not write this out explicitly. Newton explicitly gave the
factorization of $x^6 + 1$. Here the equations satisfied by the coefficients n, p, q, r
are $q = nr$, $p = qn - r$, $n = pn - q$, and $n^2 - p = 1$. This implies that

$$p = n^2 - 1, \quad q = n^3 - 2n, \quad r = n^4 - 3n^2 + 1, \tag{12.20}$$

and hence the equation satisfied by n is

$$n^5 - 4n^3 + 3n = 0.$$

This indicates the values of n to be $0, \pm 1, \pm\sqrt{3}$. When n = 0, we have r = 1, q = 0,
p = −1, and the factorization

$$1 + x^6 = (1 + x^2)(1 - x^2 + x^4). \tag{12.21}$$

When $n = \sqrt{3}$, we have r = 1, $q = \sqrt{3}$, p = 2, and the factorization given by
Newton was

$$1 + x^6 = (1 + \sqrt{3}\,x + x^2)(1 - \sqrt{3}\,x + 2x^2 - \sqrt{3}\,x^3 + x^4). \tag{12.22}$$

It follows that the second factor in (12.21) is $1 - x^2 + x^4 = (1 + \sqrt{3}\,x + x^2)(1 - \sqrt{3}\,x + x^2)$,
and the second factor in (12.22) can be written as $(1 + x^2)(1 - \sqrt{3}\,x + x^2)$.

Newton wrote down the polynomials satisfied by n for m = 3 to m = 12 and solved
the polynomials for those cases when n could be expressed in terms of quadratic surds.
In the case m = 7, he had the equation $n^6 - 5n^4 + 6n^2 - 1 = 0$. He wrote “$n^2 =$” next
to the equation and filled in no value when he realized it would involve cube roots. For
m = 11, he did not appear to expect the solutions to be in terms of quadratic surds and
wrote nothing after the equation for n. He appears to have missed the factorization,
noted by Whiteside,

$$n^9 - 8n^7 + 21n^5 - 20n^3 + 5n = n(n^4 - 5n^2 + 5)(n^4 - 3n^2 + 1),$$

yielding the quadratic surds for n when m = 10.
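The elimination described in this section can be repackaged as a recurrence. This is a modern restatement for illustration, not Newton's own procedure. Writing the second factor in (12.18) as $c_0 + c_1x + c_2x^2 + \cdots$ with $c_0 = 1$ and $c_1 = -n$, the vanishing of the coefficient of $x^k$ gives $c_k = -n\,c_{k-1} - c_{k-2}$, and the vanishing of the coefficient of $x^{m-1}$ says that this recurrence, continued one step further, yields $c_{m-1}(n) = 0$. The sketch below, with a hypothetical function name, returns that polynomial in n.

```python
def newton_equation(m):
    """Coefficients (lowest power first) of the polynomial in n that must vanish
    for 1 + n*x + x**2 to divide 1 +/- x**m, normalised to leading coefficient +1."""
    if m < 3:
        raise ValueError("m must be at least 3")
    prev, cur = [1], [0, -1]              # c_0 = 1, c_1 = -n
    for _ in range(2, m):                 # build c_2, ..., c_{m-1}
        nxt = [0] + [-c for c in cur]     # -n * c_{k-1}
        for i, c in enumerate(prev):      # ... minus c_{k-2}
            nxt[i] -= c
        prev, cur = cur, nxt
    if cur[-1] < 0:                       # fix the overall sign
        cur = [-c for c in cur]
    return cur


if __name__ == "__main__":
    for m in (4, 5, 6, 7, 10):
        print(m, newton_equation(m))
```

For m = 4, 5, 6, 7, and 10 the output reproduces the equations quoted in the text, for example $n^3 - 2n$ for m = 4 and $n^9 - 8n^7 + 21n^5 - 20n^3 + 5n$ for m = 10.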
Newton seems to have grasped the connection between the values of n and the
cosines of appropriate angles. He drew a diagram of a right triangle with one angle as
$22\frac{1}{2}^\circ = \frac{\pi}{8}$ and noted that $2\cos\frac{\pi}{8}$ gave a value of n when m = 8. At this point, he
was just one step away from Cotes’s factorization of $a^n \pm x^n$. Moreover, the number
$\cos\frac{2\pi}{m}$ is related to the length of a side of a regular polygon of m sides. Such a
polygon is constructible when $\cos\frac{2\pi}{m}$ can be expressed in terms of quadratic surds. It
is unlikely that Newton considered constructible polygons, but he may have wondered
about conditions for n to be expressed in quadratic surds. Thus, we have an interesting
connection between Newton and Gauss, although Gauss could not have been aware of
it because Newton did not publish his work, since it was incomplete.


12.4 Cotes and de Moivre’s Factorizations 


De Moivre presented his method of factorizing the more general trinomial

$$x^{2n} - 2\cos n\theta\, x^{n} + 1$$

in his Miscellanea Analytica of 1730.³³ His method depended on a formula he stated
without proof: Let l and x be cosines of arcs A and B, respectively, of the unit circle
where A is to B as the integer n to one. Then

$$x = \frac{1}{2}\sqrt[n]{l + \sqrt{l^2 - 1}} + \frac{1}{2}\,\frac{1}{\sqrt[n]{l + \sqrt{l^2 - 1}}}. \tag{12.23}$$

Note that this is equivalent to the formula named after de Moivre:

$$\cos\theta = \frac{1}{2}\left((\cos n\theta + i\sin n\theta)^{1/n} + (\cos n\theta - i\sin n\theta)^{1/n}\right). \tag{12.24}$$

De Moivre had published a similar result without proof for sine in a Philosophical
Transactions paper of 1707.³⁴ In Chapter 13, we present a proof by Daniel Bernoulli in
which he solved a difference equation obtained from the addition formula for cosine.
Here we present Euler’s simple proof given in his Introductio in Analysin Infinitorum
of 1748.³⁵ Observe that by the addition formulas for sine and cosine

$$(\cos y \pm i\sin y)(\cos z \pm i\sin z) = \cos(y+z) \pm i\sin(y+z). \tag{12.25}$$

By taking y = z, Euler had

$$(\cos y \pm i\sin y)^2 = \cos 2y \pm i\sin 2y.$$

When both sides were multiplied by $\cos y \pm i\sin y$, he got

$$(\cos y \pm i\sin y)^3 = \cos 3y \pm i\sin 3y.$$


33 de Moivre (1730a) pp. 1-2.
34 de Moivre (1707).
35 Euler (1988) p. 106.
Finally, it followed by induction that for a positive integer n,

$$(\cos y \pm i\sin y)^n = \cos ny \pm i\sin ny, \tag{12.26}$$

completing the proof.
To obtain the factorization, de Moivre set $z = \sqrt[n]{l + \sqrt{l^2 - 1}}$ so that

$$z^n - l = \sqrt{l^2 - 1} \quad\text{or}\quad z^{2n} - 2lz^n + 1 = 0, \quad\text{where } l = \cos n\theta.$$

By de Moivre’s formula (12.23), $x = \frac{1}{2}\left(z + \frac{1}{z}\right)$, or $z^2 - 2zx + 1 = 0$, where $x = \cos\theta$.
De Moivre’s theorem was therefore equivalent to the statement that $z^{2n} - 2\cos n\theta\, z^n + 1 = 0$
when $z^2 - 2\cos\theta\, z + 1 = 0$. Thus, $z^2 - 2xz + 1$ was a factor of $z^{2n} - 2lz^n + 1$.
To obtain the other n − 1 factors, de Moivre observed that

$$(\cos A \pm i\sin A)^{1/n} = \cos\frac{2k\pi \pm A}{n} + i\sin\frac{2k\pi \pm A}{n}, \quad k = 0, 1, 2, \ldots.$$

The factorization thus obtained after taking $\theta = \frac{A}{n}$ may be written in modern
notation as

$$z^{2n} - (2\cos A)z^n + 1 = \prod_{k=0}^{n-1}\left(z^2 - 2\cos\left(\frac{2k\pi + A}{n}\right)z + 1\right). \tag{12.27}$$

We note that de Moivre used the symbol C for $2\pi$. Cotes’s factorization theorems
are actually corollaries of de Moivre’s (12.27). For example, let C be a circle of
radius a and center O with B a point on the circumference and P a point on OB
such that OP = x. Also let $A_1, A_2, \ldots, A_n$ be points on the circumference such that,
for $k = 1, 2, \ldots, n$, the angle $BOA_k = \frac{(2k-1)\pi}{n}$. Then the product $A_1P \cdot A_2P \cdots A_nP$
is equal to $x^n + a^n$. This result of Cotes can be derived by taking $A = \pi$ in (12.27).
But by taking A = 0, we obtain Cotes’s result (12.9).
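Both (12.27) and Cotes's circle property (12.9) lend themselves to a quick numerical check. The sketch below is illustrative only; the function names and sample values are assumptions, and the division points are placed at angles $\frac{2\pi k}{n}$ measured from the line through O and P.

```python
import cmath
import math


def de_moivre_product(z, A, n):
    """Right-hand side of (12.27) evaluated at a (possibly complex) number z."""
    prod = 1
    for k in range(n):
        prod *= z * z - 2 * math.cos((2 * k * math.pi + A) / n) * z + 1
    return prod


def de_moivre_trinomial(z, A, n):
    """Left-hand side of (12.27)."""
    return z ** (2 * n) - 2 * math.cos(A) * z ** n + 1


def cotes_distance_product(x, a, n):
    """Product of the distances from P (with OP = x) to n equally spaced
    points on the circle of radius a, one of them on the ray OP."""
    prod = 1.0
    for k in range(n):
        point = a * cmath.exp(2j * math.pi * k / n)
        prod *= abs(x - point)
    return prod


if __name__ == "__main__":
    z, A, n = 0.7 + 0.4j, 1.1, 5
    print(de_moivre_trinomial(z, A, n), de_moivre_product(z, A, n))
    x, a, n = 0.6, 1.3, 7
    print(cotes_distance_product(x, a, n), a ** n - x ** n)
```

In each printed pair the two values should coincide up to rounding error.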

Cotes’s factorization is most useful when viewed analytically; we will use it in the
next section to integrate a specific rational function and later when discussing Euler’s
factorization of trigonometric functions. Set A = 0 in (12.27) so that the left-hand side
is $(z^n - 1)^2$. Observe that the factor corresponding to k = 0 on the right-hand side is
$(z - 1)^2$ and the factor corresponding to $k \ge 1$ is identical with the factor corresponding
to n − k, because

$$\cos\frac{2\pi k}{n} = \cos\frac{2\pi(n - k)}{n}.$$

Thus, for odd n,

$$z^n - 1 = (z - 1)\prod_{k=1}^{\frac{n-1}{2}}\left(z^2 - 2\cos\frac{2\pi k}{n}\,z + 1\right).$$

Now, putting $z = \frac{x}{y}$ and multiplying by $y^n$, we obtain

$$x^n - y^n = (x - y)\prod_{k=1}^{\frac{n-1}{2}}\left(x^2 - 2\cos\frac{2\pi k}{n}\,xy + y^2\right);$$

replacing y by −y, with n still odd, we can write

$$x^n + y^n = (x + y)\prod_{k=1}^{\frac{n-1}{2}}\left(x^2 + 2\cos\frac{2\pi k}{n}\,xy + y^2\right).$$


12.5 Euler: Integration of Rational Functions 


In a letter to Niklaus I Bernoulli dated January 16, 1742,³⁶ Euler wrote of his efforts
to evaluate integrals of the form

$$\int_0^\infty \frac{x^{m-1}\,dx}{1 \pm x^n},$$


where m and n were integers, and how such integrals could be applied to obtain, for the 
first time, the partial fractions expansions of trigonometric functions. Since Euler was 
in the habit of communicating his important ideas to his interested colleagues almost 
immediately, this letter indicates that Euler probably began work on integration of 
rational functions sometime in 1741. One difficulty in this task is the factorization of 
the denominator. Now Euler was aware from the work of de Moivre that a polynomial 
of degree 2n could be factorized if it took the form $x^{2n} + 2\cos\theta\, x^n + 1$. Thus, Euler
considered integrals of rational functions,

$$\frac{x^{m-1}}{x^{2n} + 2\cos\theta\, x^n + 1}, \tag{12.28}$$


and in so doing, found a gold mine. He discovered results in this area that were
intimately connected with many of his most significant and famous results in mathematical
analysis: the gamma function, the product and partial fractions representations
of trigonometric functions, the evaluation of zeta values, $\sum_{k=1}^{\infty} \frac{1}{k^{2n}}$, and the values of
some L-series such as $\sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)^{2n+1}}$. It is no wonder that Euler wrote many papers
on the integration of rational functions and its connections with these topics, as he
returned to them again and again, working out alternative proofs and extensions of
his results. In 1742, he presented two papers to the Berlin Academy,³⁷ published in
1743. In these papers, Euler evaluated the integral of the expression (12.28) and, in
particular, he showed that


36 Eu. IV A-2 pp. 483-490.
37 Eu. I-17 pp. 1-34, 35-69. E 59, E 60.
$$\int_0^\infty \frac{x^{m-1}}{1 + x^{2n}}\,dx = \frac{\pi}{2n}\csc\frac{m\pi}{2n} = \int_0^1 \frac{x^{m-1} + x^{2n-m-1}}{1 + x^{2n}}\,dx$$

$$= \frac{1}{m} + \frac{1}{2n-m} - \frac{1}{2n+m} - \frac{1}{4n-m} + \frac{1}{4n+m} + \cdots, \tag{12.29}$$

as well as a similar result for the principal value of $\int_0^\infty \frac{x^{m-1}}{1 - x^{2n}}\,dx$. In his paper
De inventione integralium,³⁸ he evaluated the integral by factorizing $1 + x^{2n}$ using
Cotes’s formula and then obtained the indefinite integral as a sum of logarithms
and arctangents. In his Theoremata circa reductionem formularum integralium ad
quadraturam circuli,³⁹ Euler followed a different route to the evaluation of the integral
in (12.29). He first showed that

$$\frac{\pi}{2n}\csc\frac{m\pi}{2n} = \frac{1}{m} + \frac{1}{2n-m} - \frac{1}{2n+m} - \frac{1}{4n-m} + \frac{1}{4n+m} + \cdots \tag{12.30}$$

by a method to be discussed in our Chapters 15 and 16. The series on the right-hand
side of (12.30) is called the partial fractions expansion of $\frac{\pi}{2n}\csc\frac{m\pi}{2n}$. Next, Euler
showed how to convert the series of fractions on the right-hand side of (12.30) into
an integral. For that purpose, he considered the series


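\[
S(x) = \frac{x^m}{m} + \frac{x^{2n-m}}{2n-m} - \frac{x^{2n+m}}{2n+m} - \frac{x^{4n-m}}{4n-m} + \frac{x^{4n+m}}{4n+m} + \cdots,
\]
whose derivative would be the sum of two geometric series:
\[
\begin{aligned}
\frac{dS}{dx} &= x^{m-1} + x^{2n-m-1} - x^{2n+m-1} - x^{4n-m-1} + x^{4n+m-1} + \cdots \\
&= x^{m-1}\left(1 - x^{2n} + x^{4n} - \cdots\right) + x^{2n-m-1}\left(1 - x^{2n} + x^{4n} - \cdots\right) \\
&= \frac{x^{m-1}}{1+x^{2n}} + \frac{x^{2n-m-1}}{1+x^{2n}}.
\end{aligned} \tag{12.31}
\]
Upon integrating (12.31), he obtained
\[
\begin{aligned}
\frac{1}{m} + \frac{1}{2n-m} - \frac{1}{2n+m} - \cdots
&= \int_0^1 \frac{x^{m-1}}{1+x^{2n}}\,dx + \int_0^1 \frac{x^{2n-m-1}}{1+x^{2n}}\,dx \\
&= \int_0^1 \frac{x^{m-1}}{1+x^{2n}}\,dx + \int_1^\infty \frac{y^{m-1}}{1+y^{2n}}\,dy \quad \left(\text{where } y = \frac{1}{x}\right) \\
&= \int_0^\infty \frac{x^{m-1}}{1+x^{2n}}\,dx.
\end{aligned}
\]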
38 E 60.
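The equality in (12.29) between the closed form and the series can be checked numerically. The following is a minimal sketch, not part of Euler's argument; the function names `csc_closed_form` and `paired_series` are hypothetical, and the series is summed in blocks of two terms with a common sign, which is an assumption about how best to group the slowly convergent terms.

```python
import math

def csc_closed_form(m, n):
    """Right-hand closed form in (12.29): (pi / 2n) * csc(m * pi / 2n)."""
    return math.pi / (2 * n) / math.sin(m * math.pi / (2 * n))

def paired_series(m, n, blocks=200000):
    """Partial sum of 1/m + 1/(2n-m) - 1/(2n+m) - 1/(4n-m) + ...

    Block j contributes the two terms 1/(2nj + m) and 1/(2n(j+1) - m),
    both with sign (-1)^j.
    """
    total = 0.0
    for j in range(blocks):
        sign = -1.0 if j % 2 else 1.0
        total += sign * (1.0 / (2 * n * j + m) + 1.0 / (2 * n * (j + 1) - m))
    return total

if __name__ == "__main__":
    for m, n in [(1, 1), (1, 2), (3, 2)]:
        print(m, n, csc_closed_form(m, n), paired_series(m, n))
```

Because the series converges only like the reciprocal of the number of blocks, agreement to about five decimal places is all that should be expected from this sketch.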
As we discuss in Chapter 16, Euler’s first derivation of (12.30) met with objections
based on questions of convergence and zeros of the sine function. He therefore sought
and found an alternative proof using integration; for this proof, see Section 12.6. First
note that in his papers during the 1740s, Euler habitually assumed that the reader knew
the value of the integral
\[
\int \frac{(Ax + B)\,dx}{b^2x^2 - 2abx\cos\zeta + a^2}
\]
in terms of logarithms and arctangents. However, in chapter 1 of the first volume of his
book on integral calculus, presented to the Petersburg Academy in 1766 and published
in 1768,⁴⁰ he gave the details of this calculation. He first noted that
\[
\frac{d}{dx}\left(b^2x^2 - 2abx\cos\zeta + a^2\right) = 2b^2x - 2ab\cos\zeta.
\]
Thus
\[
\int \frac{(Ax + B)\,dx}{b^2x^2 - 2abx\cos\zeta + a^2}
= \frac{A}{2b^2}\int \frac{(2b^2x - 2ab\cos\zeta)\,dx}{b^2x^2 - 2abx\cos\zeta + a^2}
+ \int \frac{\left(B + \frac{Aa}{b}\cos\zeta\right)dx}{(bx - a\cos\zeta)^2 + a^2\sin^2\zeta}. \tag{12.32}
\]
Euler observed that the first integral on the right-hand side of (12.32) was
\[
\frac{A}{2b^2}\ln\left(b^2x^2 - 2abx\cos\zeta + a^2\right); \tag{12.33}
\]
for the second integral, he set $bx - a\cos\zeta = (a\sin\zeta)v$ so that $dx = \frac{a\sin\zeta}{b}\,dv$ and
the integral became
\[
\frac{Bb + Aa\cos\zeta}{ab^2\sin\zeta}\int \frac{dv}{1+v^2} = \frac{Bb + Aa\cos\zeta}{ab^2\sin\zeta}\arctan\frac{bx - a\cos\zeta}{a\sin\zeta}. \tag{12.34}
\]
Euler then gave the arctangent in an alternate form, noting that since $\arctan\left(\frac{\cos\zeta}{\sin\zeta}\right)$
was a constant, it could be added to the arctangent in (12.34) without affecting the
value of the indefinite integral. He had
\[
\arctan\frac{bx - a\cos\zeta}{a\sin\zeta} + \arctan\left(\frac{\cos\zeta}{\sin\zeta}\right) = \arctan\frac{bx\sin\zeta}{a - bx\cos\zeta} \tag{12.35}
\]
by employing the formula
\[
\arctan A + \arctan B = \arctan\frac{A+B}{1-AB}.
\]


40 Eu. I-11. E 342.
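The antiderivative assembled from (12.33) and (12.34) can be sanity-checked by differentiating it numerically and comparing with the integrand. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the parameter values in the demonstration are arbitrary, and a central difference stands in for the derivative.

```python
import math

def integrand(x, A, B, a, b, zeta):
    """(Ax + B) / (b^2 x^2 - 2abx cos(zeta) + a^2)."""
    return (A * x + B) / (b * b * x * x - 2 * a * b * x * math.cos(zeta) + a * a)

def antiderivative(x, A, B, a, b, zeta):
    """Sum of the logarithm (12.33) and the arctangent (12.34)."""
    quad = b * b * x * x - 2 * a * b * x * math.cos(zeta) + a * a
    log_part = A / (2 * b * b) * math.log(quad)
    coeff = (B * b + A * a * math.cos(zeta)) / (a * b * b * math.sin(zeta))
    atan_part = coeff * math.atan((b * x - a * math.cos(zeta)) / (a * math.sin(zeta)))
    return log_part + atan_part

def central_difference(f, x, h=1e-5):
    """Symmetric finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def arctangent_forms(x, a, b, zeta):
    """Both sides of (12.35); they agree when a - b*x*cos(zeta) > 0."""
    left = (math.atan((b * x - a * math.cos(zeta)) / (a * math.sin(zeta)))
            + math.atan(math.cos(zeta) / math.sin(zeta)))
    right = math.atan(b * x * math.sin(zeta) / (a - b * x * math.cos(zeta)))
    return left, right

if __name__ == "__main__":
    params = (2.0, 3.0, 1.5, 0.7, 1.1)  # arbitrary A, B, a, b, zeta
    for x in (-1.0, 0.5, 2.0):
        slope = central_difference(lambda t: antiderivative(t, *params), x)
        print(x, slope, integrand(x, *params))
```

The two arctangent forms in (12.35) differ by a multiple of pi in general, so the comparison in `arctangent_forms` is meaningful only on the branch noted in its docstring.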
Euler devoted two long papers, coming to a total of 150 pages, to the evaluation
of indefinite integrals of rational functions.⁴¹ Both were presented to the Petersburg
Academy on the same day in 1748 and published consecutively in 1751 in the
Academy journal. Both papers tackle nearly the same problems, though the second
offers simplified methods in some cases. For example, in the first paper, Euler wrote
the polynomial in the denominator of the integrand in the form
\[
N(x) = (1 + px)(1 + qx)(1 + rx)\cdots,
\]
where some factors could be repeated. Suppose $M(x)$ is a polynomial whose degree
is less than that of $N(x)$. Now allow that $1 + px$ is repeated exactly $n$ times and that
\[
N(x) = (1 + px)^n A(x),
\]
\[
\frac{M(x)}{N(x)} = \frac{C_1}{1+px} + \frac{C_2}{(1+px)^2} + \cdots + \frac{C_n}{(1+px)^n} + \frac{D(x)}{A(x)}, \tag{12.36}
\]
and
\[
V(p) = p^{n-1}\,\frac{M\left(-\frac{1}{p}\right)}{A\left(-\frac{1}{p}\right)}. \tag{12.37}
\]
Euler showed after some work that
\[
C_k = \frac{p}{(n-k)!}\,\frac{d^{n-k}}{dp^{n-k}}\left(\frac{V}{p^k}\right). \tag{12.38}
\]
In the second paper, Euler presented the formula for the case where the factors were
of the form $(p + qx)^n$. He let $N(x) = (p + qx)^n S(x)$ and
\[
\frac{M(x)}{N(x)} = \frac{b_0}{(p+qx)^n} + \frac{b_1}{(p+qx)^{n-1}} + \cdots + \frac{b_{n-1}}{p+qx} + \frac{D(x)}{S(x)}. \tag{12.39}
\]
To find $b_j$, he then multiplied (12.39) by $(p + qx)^n$ to get
\[
\frac{M(x)}{S(x)} = b_0 + b_1(p+qx) + \cdots + b_j(p+qx)^j + \cdots + b_{n-1}(p+qx)^{n-1} + \frac{D(x)}{S(x)}(p+qx)^n
\]
and thus
\[
b_j = \frac{1}{j!\,q^j}\,\frac{d^j}{dx^j}\left(\frac{M}{S}\right)\bigg|_{x=-\frac{p}{q}}, \qquad j = 0,1,2,\ldots,n-1. \tag{12.40}
\]
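Formula (12.40) says that the $b_j$ are Taylor coefficients of $M/S$ at $x = -p/q$, rescaled by $q^j$. The sketch below computes them exactly with rational arithmetic; it is an illustration rather than Euler's procedure. The function names are hypothetical, polynomials are assumed to be coefficient lists ordered from the constant term upward, and the derivatives are obtained by a power-series division instead of repeated differentiation.

```python
from fractions import Fraction
from math import comb

def shift(poly, x0):
    """Coefficients of poly(x0 + t) in powers of t (low degree first)."""
    out = [Fraction(0)] * len(poly)
    for i, c in enumerate(poly):
        for j in range(i + 1):
            out[j] += c * comb(i, j) * x0 ** (i - j)
    return out

def series_div(num, den, terms):
    """First `terms` power-series coefficients of num(t)/den(t); needs den[0] != 0."""
    num = num + [Fraction(0)] * terms
    quotient = []
    for i in range(terms):
        acc = num[i] - sum(quotient[j] * den[i - j]
                           for j in range(i) if i - j < len(den))
        quotient.append(acc / den[0])
    return quotient

def b_coeffs(M, S, p, q, n):
    """b_0, ..., b_{n-1} of (12.39) for M(x) / ((p + qx)^n S(x))."""
    x0 = Fraction(-p, q)
    # Taylor coefficient j of M/S at x0 equals (1/j!) d^j/dx^j (M/S) there.
    taylor = series_div(shift(M, x0), shift(S, x0), n)
    # (p + qx)^j = q^j (x - x0)^j, hence the division by q^j in (12.40).
    return [taylor[j] / Fraction(q) ** j for j in range(n)]

if __name__ == "__main__":
    # 1 / (x (1 + x)^2): here M = 1, S = x, p = q = 1, n = 2.
    print(b_coeffs([1], [0, 1], 1, 1, 2))
```

For the demonstration case the result corresponds to the decomposition $\frac{1}{x(1+x)^2} = -\frac{1}{(1+x)^2} - \frac{1}{1+x} + \frac{1}{x}$, which can be confirmed by putting the right-hand side over a common denominator.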


Returning to our discussion of Euler’s paper on definite integrals, “De inventione 
integralium,” note that he started off with a consideration of sums of sines and cosines 


41 Eu. I-17 pp. 70-148, 149-194. E 162, 163.
that he required to evaluate definite integrals. For example, in Problem 5 of his paper
he summed
\[
S = a\sin\alpha + (a+b)\sin(\alpha+u) + (a+2b)\sin(\alpha+2u) + \cdots + (a+(p-1)b)\sin(\alpha+(p-1)u)
\]
to obtain
\[
S = \frac{(a-b)\sin\alpha - a\sin(\alpha-u) + (a+pb)\sin(\alpha+(p-1)u) - (a+(p-1)b)\sin(\alpha+pu)}{2 - 2\cos u}. \tag{12.41}
\]
To prove this, observe that
\[
\begin{aligned}
(2\cos u)S &= \sum_{k=0}^{p-1}(a+kb)\,2\sin(\alpha+ku)\cos u \\
&= \sum_{k=0}^{p-1}(a+kb)\bigl(\sin(\alpha+(k-1)u) + \sin(\alpha+(k+1)u)\bigr) \\
&= a\sin(\alpha-u) - (a-b)\sin\alpha + 2\sum_{k=0}^{p-1}(a+kb)\sin(\alpha+ku) \\
&\quad - (a+pb)\sin(\alpha+(p-1)u) + (a+(p-1)b)\sin(\alpha+pu).
\end{aligned}
\]
Rearranging the terms yields (12.41).
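As a quick check of (12.41), one may compare the closed form with a direct summation. This is a minimal sketch with hypothetical function names; `alpha` denotes the initial angle and `a`, `b` the arithmetic-progression coefficients, and the sample values are arbitrary.

```python
import math

def direct_sum(a, b, alpha, u, p):
    """Sum of (a + k b) sin(alpha + k u) for k = 0, ..., p - 1."""
    return sum((a + k * b) * math.sin(alpha + k * u) for k in range(p))

def closed_form(a, b, alpha, u, p):
    """Right-hand side of (12.41); requires cos(u) != 1."""
    numerator = ((a - b) * math.sin(alpha)
                 - a * math.sin(alpha - u)
                 + (a + p * b) * math.sin(alpha + (p - 1) * u)
                 - (a + (p - 1) * b) * math.sin(alpha + p * u))
    return numerator / (2 - 2 * math.cos(u))

if __name__ == "__main__":
    print(direct_sum(2.0, 0.5, 0.3, 0.7, 9), closed_form(2.0, 0.5, 0.3, 0.7, 9))
```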
Later in the paper, Euler showed that for $0 < m < 2n$
\[
\int_0^\infty \frac{x^{m-1}}{1 - x^{2n}}\,dx = \frac{\pi}{2n}\cot\frac{m\pi}{2n}, \tag{12.42}
\]
in which we note that the integral is improper at $x = 1$, the integrand being undefined at that point.
The first definition of an improper integral was later given by Cauchy in the 1820s.⁴²
In fact, the integral in (12.42) does not exist; Euler had found its principal value,
also defined by Cauchy⁴³ as
\[
\lim_{\substack{\epsilon \to 0 \\ b \to \infty}} \left(\int_0^{1-\epsilon} + \int_{1+\epsilon}^{b}\right)\frac{x^{m-1}}{1 - x^{2n}}\,dx.
\]
Euler noted that, by Cotes’s factorization,
\[
1 - x^{2n} = (1 - x^2)\prod_{k=1}^{n-1}\left(1 + 2x\cos\frac{k\pi}{n} + x^2\right). \tag{12.43}
\]
k=1 


42 Cauchy (1823) p. 93. 
43 ibid. p. 96. 
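He first determined the value of the indefinite integral; note that we will give the
details of this evaluation in the next section, when we consider a more general integral.
For the indefinite integral, he had
\[
\begin{aligned}
\int \frac{x^{m-1}}{1 - x^{2n}}\,dx &= \frac{1}{2n}\left((-1)^{m-1}\ln(1+x) - \ln(1-x)\right) \\
&\quad + \frac{(-1)^{m-1}}{2n}\sum_{k=1}^{n-1}\cos\frac{km\pi}{n}\,\ln\left(1 + 2x\cos\frac{k\pi}{n} + x^2\right) \\
&\quad + \frac{(-1)^{m-1}}{n}\sum_{k=1}^{n-1}\sin\frac{km\pi}{n}\,\arctan\frac{x\sin\frac{k\pi}{n}}{1 + x\cos\frac{k\pi}{n}} + C.
\end{aligned} \tag{12.44}
\]
He observed that if $x$ were replaced by $\frac{1}{x}$, then
\[
\int_1^\infty \frac{x^{m-1}}{1 - x^{2n}}\,dx = -\int_0^1 \frac{x^{2n-m-1}}{1 - x^{2n}}\,dx,
\]
so that
\[
\begin{aligned}
\int_0^\infty \frac{x^{m-1}}{1 - x^{2n}}\,dx &= \int_0^1 \frac{x^{m-1}}{1 - x^{2n}}\,dx + \int_1^\infty \frac{x^{m-1}}{1 - x^{2n}}\,dx \\
&= \int_0^1 \frac{x^{m-1} - x^{2n-m-1}}{1 - x^{2n}}\,dx.
\end{aligned}
\]
He next expanded $\frac{1}{1-x^{2n}}$ as a geometric series, $1 + x^{2n} + x^{4n} + \cdots$, to obtain
\[
\begin{aligned}
\int_0^\infty \frac{x^{m-1}}{1 - x^{2n}}\,dx &= \int_0^1 \left(x^{m-1} - x^{2n-m-1}\right)\left(1 + x^{2n} + x^{4n} + \cdots\right)dx \\
&= \frac{1}{m} - \frac{1}{2n-m} + \frac{1}{2n+m} - \frac{1}{4n-m} + \frac{1}{4n+m} - \frac{1}{6n-m} + \cdots.
\end{aligned} \tag{12.45}
\]
Since $(-1)^{2n-m} = (-1)^m$ and $\cos\frac{k(2n-m)\pi}{n} = \cos\frac{km\pi}{n}$, by applying (12.44),
Euler had
\[
\begin{aligned}
\int \frac{x^{2n-m-1}}{1 - x^{2n}}\,dx &= \frac{1}{2n}\left((-1)^{m-1}\ln(1+x) - \ln(1-x)\right) \\
&\quad + \frac{(-1)^{m-1}}{2n}\sum_{k=1}^{n-1}\cos\frac{km\pi}{n}\,\ln\left(1 + 2x\cos\frac{k\pi}{n} + x^2\right) \\
&\quad - \frac{(-1)^{m-1}}{n}\sum_{k=1}^{n-1}\sin\frac{km\pi}{n}\,\arctan\frac{x\sin\frac{k\pi}{n}}{1 + x\cos\frac{k\pi}{n}} + C.
\end{aligned} \tag{12.46}
\]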
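When (12.46) was subtracted from (12.44), all logarithmic terms cancelled; hence
\[
\int \frac{x^{m-1} - x^{2n-m-1}}{1 - x^{2n}}\,dx = \frac{2(-1)^{m-1}}{n}\sum_{k=1}^{n-1}\sin\frac{km\pi}{n}\,\arctan\frac{x\sin\frac{k\pi}{n}}{1 + x\cos\frac{k\pi}{n}}
\]
and thus
\[
\int_0^1 \frac{x^{m-1} - x^{2n-m-1}}{1 - x^{2n}}\,dx = \frac{2(-1)^{m-1}}{n}\sum_{k=1}^{n-1}\sin\frac{km\pi}{n}\,\arctan\frac{\sin\frac{k\pi}{n}}{1 + \cos\frac{k\pi}{n}}.
\]
Next, since
\[
\frac{\sin\frac{k\pi}{n}}{1 + \cos\frac{k\pi}{n}} = \frac{2\sin\frac{k\pi}{2n}\cos\frac{k\pi}{2n}}{2\cos^2\frac{k\pi}{2n}} = \tan\frac{k\pi}{2n},
\]
\[
\arctan\frac{\sin\frac{k\pi}{n}}{1 + \cos\frac{k\pi}{n}} = \arctan\left(\tan\frac{k\pi}{2n}\right) = \frac{k\pi}{2n}.
\]
Thus, Euler got
\[
\int_0^\infty \frac{x^{m-1}}{1 - x^{2n}}\,dx = \int_0^1 \frac{x^{m-1} - x^{2n-m-1}}{1 - x^{2n}}\,dx
= \frac{(-1)^{m-1}\pi}{n^2}\sum_{k=1}^{n-1} k\sin\frac{km\pi}{n}. \tag{12.47}
\]
Now Euler evaluated the sum contained in (12.47) by taking $p = n$, $a = 0$, $\alpha = 0$,
$b = 1$, and $u = \frac{m\pi}{n}$ in (12.41):
\[
\begin{aligned}
\sum_{k=1}^{n-1} k\sin\frac{km\pi}{n} &= \frac{n\sin\left((n-1)\frac{m\pi}{n}\right) - (n-1)\sin(m\pi)}{2 - 2\cos\frac{m\pi}{n}} \\
&= \frac{n\sin\left(m\pi - \frac{m\pi}{n}\right)}{4\sin^2\frac{m\pi}{2n}} \\
&= \frac{n(-1)^{m-1}\sin\frac{m\pi}{n}}{4\sin^2\frac{m\pi}{2n}} \\
&= \frac{(-1)^{m-1}\,n\cos\frac{m\pi}{2n}}{2\sin\frac{m\pi}{2n}}.
\end{aligned}
\]
Hence
\[
\int_0^\infty \frac{x^{m-1}}{1 - x^{2n}}\,dx = \frac{\pi}{2n}\cot\frac{m\pi}{2n}. \tag{12.48}
\]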
Euler combined (12.45) with (12.48) to discover his famous partial fractions
expansion of the cotangent function:
\[
\frac{\pi}{2n}\cot\frac{m\pi}{2n} = \frac{1}{m} - \frac{1}{2n-m} + \frac{1}{2n+m} - \frac{1}{4n-m} + \frac{1}{4n+m} - \cdots. \tag{12.49}
\]
We remark that this identity led Euler to one of his half-dozen or more evaluations
of $\sum_{n=1}^\infty \frac{1}{n^{2k}}$, where $k$ is a positive integer. We work out some of these evaluations in
Chapters 15 and 16. In his “De inventione integralium,” Euler considered the integral
of (12.28), but he did not include a result for this integral similar to (12.45). Whether
or not he was aware of such a result in 1748, he did not publish it until thirty years
later and we discuss it in the next section.
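The cotangent expansion (12.49) lends itself to a numerical check. The sketch below uses hypothetical function names and pairs the terms as $\frac{1}{2nj+m} - \frac{1}{2nj-m}$, an assumption made so that the partial sums converge like the reciprocal of the number of terms.

```python
import math

def cot_closed_form(m, n):
    """Left-hand side of (12.49): (pi / 2n) * cot(m * pi / 2n)."""
    return math.pi / (2 * n) / math.tan(m * math.pi / (2 * n))

def cot_series(m, n, terms=200000):
    """Partial sum 1/m + sum_{j>=1} (1/(2nj + m) - 1/(2nj - m))."""
    total = 1.0 / m
    for j in range(1, terms + 1):
        total += 1.0 / (2 * n * j + m) - 1.0 / (2 * n * j - m)
    return total

if __name__ == "__main__":
    for m, n in [(1, 1), (1, 2), (3, 2)]:
        print(m, n, cot_closed_form(m, n), cot_series(m, n))
```

With $m = 1$, $n = 2$ the series reduces to $1 - \frac13 + \frac15 - \cdots$, so the check recovers Leibniz's value $\frac{\pi}{4}$.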


12.6 Euler’s “Investigatio Valoris Integralis” 


Euler’s paper “Investigatio valoris integralis $\int \frac{x^{m-1}\,dx}{1 - 2x^k\cos\theta + x^{2k}}$ a termino $x = 0$ ad
$x = \infty$ extensi” was presented to the Petersburg Academy in 1775 but was published
ten years later by the Academy in the second volume of his Opuscula analytica.⁴⁴ To
evaluate the integral in the title of the paper, he used methods very similar to those
he employed in his 1743 “De inventione integralium.” However, he found some new
corollaries such as the Fourier series for $\cos\lambda x$ and $\sin\lambda x$; these appeared as the
partial fractions expansion for the integral. Note that the denominator of the integrand
can be factorized by de Moivre’s formula (12.27) and the integrand may be expressed
as a sum of partial fractions when $m < 2k$:
\[
\frac{x^{m-1}}{1 - 2x^k\cos\theta + x^{2k}} = \sum_{s=0}^{k-1}\frac{A_s + B_s x}{1 - 2x\cos\left(\frac{2s\pi+\theta}{k}\right) + x^2}. \tag{12.50}
\]
To calculate the $A_s$ and $B_s$, Euler observed that
\[
1 - 2x\cos\omega_s + x^2 = \left(x - e^{i\omega_s}\right)\left(x - e^{-i\omega_s}\right).
\]
Here note that Euler did not employ subscripts and except on rare occasions he
wrote $\cos\omega + \sqrt{-1}\sin\omega$ instead of $e^{i\omega}$. In any case, he let $\omega_s = \frac{2s\pi+\theta}{k}$ and set
\[
\frac{A_s + B_s x}{1 - 2x\cos\omega_s + x^2} = \frac{f_s}{x - e^{i\omega_s}} + \frac{g_s}{x - e^{-i\omega_s}}. \tag{12.51}
\]
Now by multiplying equation (12.51) by the denominator of its left-hand side,
he had
\[
A_s + B_s x = f_s\left(x - e^{-i\omega_s}\right) + g_s\left(x - e^{i\omega_s}\right)
\]


44 Eu. I-18 pp. 190-208. E 589.
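and equating coefficients of the powers of $x$, Euler arrived at
\[
B_s = f_s + g_s \quad\text{and}\quad A_s = -\left(f_s e^{-i\omega_s} + g_s e^{i\omega_s}\right). \tag{12.52}
\]
To find $f_s$, Euler noted that (12.50) when multiplied by $x - e^{i\omega_s}$ yielded
\[
\frac{x^{m-1}\left(x - e^{i\omega_s}\right)}{1 - 2x^k\cos\theta + x^{2k}} = f_s + R_s\left(x - e^{i\omega_s}\right), \tag{12.53}
\]
where the $R_s$ consisted of the remaining partial fractions. He let $x \to e^{i\omega_s}$. Now
denoting $f(x) = x^{2k} - 2x^k\cos\theta + 1$ and $a = e^{i\omega_s}$, one gets $f(a) = 0$. Also, since
$\frac{x-a}{f(x)} = \frac{x-a}{f(x) - f(a)}$ and using l’Hôpital’s rule:
\[
\begin{aligned}
f_s &= \lim_{x\to a}\frac{x^{m-1}(x-a)}{f(x) - f(a)} = \frac{a^{m-1}}{f'(a)} = \frac{a^m}{a f'(a)} \\
&= \frac{e^{im\omega_s}}{2k e^{2ik\omega_s} - 2k e^{ik\omega_s}\cos\theta}
\end{aligned} \tag{12.54}
\]
\[
= \frac{e^{im\omega_s}}{2k e^{2i\theta} - 2k e^{i\theta}\cos\theta}, \tag{12.55}
\]
simplifying to
\[
f_s = \frac{e^{im\omega_s}}{2ki\sin\theta\, e^{i\theta}}. \tag{12.56}
\]
As Euler remarked, $g_s$ could be obtained from $f_s$ by changing every $\sqrt{-1}$ to $-\sqrt{-1}$
or, in other words, $i$ to $-i$, since $g_s$ is the complex conjugate of $f_s$. So
\[
g_s = -\frac{e^{-im\omega_s}}{2ki\sin\theta\, e^{-i\theta}}
\]
and then by (12.52)
\[
B_s = \frac{e^{i(m\omega_s - \theta)} - e^{-i(m\omega_s - \theta)}}{2i(k\sin\theta)} = \frac{\sin(m\omega_s - \theta)}{k\sin\theta}, \tag{12.57}
\]
\[
A_s = -\frac{e^{i(m-1)\omega_s - i\theta} - e^{-i(m-1)\omega_s + i\theta}}{2i(k\sin\theta)} = -\frac{\sin\left((m-1)\omega_s - \theta\right)}{k\sin\theta}. \tag{12.58}
\]
Now
\[
\begin{aligned}
\int \frac{(A_s + B_s x)\,dx}{1 - 2x\cos\omega_s + x^2}
&= \int \left(\frac{B_s(x - \cos\omega_s)}{1 - 2x\cos\omega_s + x^2} + \frac{A_s + B_s\cos\omega_s}{1 - 2x\cos\omega_s + x^2}\right)dx \\
&= \frac{1}{2}B_s\ln\left(1 - 2x\cos\omega_s + x^2\right) + \frac{A_s + B_s\cos\omega_s}{\sin\omega_s}\arctan\frac{x\sin\omega_s}{1 - x\cos\omega_s}.
\end{aligned} \tag{12.59}
\]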
Thus, by (12.50),
\[
\int_0^\infty \frac{x^{m-1}\,dx}{1 - 2x^k\cos\theta + x^{2k}} \tag{12.60}
\]
was shown to be a sum of the integrals in (12.59) as $s$ ranged from 0 to $k - 1$. Now
note that the integrals are on the interval $(0,\infty)$. At $x = 0$, the value of the logarithm
is zero, as is the value of the arctangent. Euler observed that for large $x$
\[
\ln\left(1 - 2x\cos\omega_s + x^2\right) = \ln x^2 + \ln\left(1 - \frac{2\cos\omega_s}{x} + \frac{1}{x^2}\right) \approx 2\ln x.
\]
It is now clear that, by (12.60) and because $\omega_s = \frac{2s\pi+\theta}{k}$, the logarithmic part of the
integral in that formula must behave for large $x$ as
\[
\ln x\sum_{s=0}^{k-1}B_s = \frac{\ln x}{k\sin\theta}\sum_{s=0}^{k-1}\sin\left(\frac{m(2s\pi+\theta)}{k} - \theta\right)
= \frac{\ln x}{k\sin\theta}\sum_{s=0}^{k-1}\sin(2s\alpha + \zeta), \tag{12.61}
\]
where $\alpha = \frac{m\pi}{k}$ and $\zeta = \frac{(m-k)\theta}{k}$. Euler noted that
\[
2\sin\alpha\,\sin(2s\alpha + \zeta) = \cos\left((2s-1)\alpha + \zeta\right) - \cos\left((2s+1)\alpha + \zeta\right), \tag{12.62}
\]
meaning that when (12.62) was summed over the range $s = 0$ to $s = k-1$, cancellation
would leave only the first and last terms remaining. Thus
\[
2\sin\alpha\sum_{s=0}^{k-1}\sin(2s\alpha + \zeta) = \cos(\alpha - \zeta) - \cos\left((2k-1)\alpha + \zeta\right). \tag{12.63}
\]
Now the sum of the angles $\alpha - \zeta$ and $(2k-1)\alpha + \zeta$ is $2k\alpha = 2m\pi$; hence, the
cosines of these two angles are the same and the sums in (12.63) and (12.61) are
zero, or
\[
\ln x\sum_{s=0}^{k-1}B_s = 0.
\]
Euler thus demonstrated that (12.60) had a vanishing logarithmic part. He proceeded
to calculate the arctangent part in (12.59); though he did not employ the
language of limits, we use them for convenience:
\[
\begin{aligned}
\lim_{x\to\infty}\arctan\left(\frac{x\sin\omega_s}{1 - x\cos\omega_s}\right) &= \arctan(-\tan\omega_s) \\
&= \arctan\left(\tan(\pi - \omega_s)\right) \\
&= \pi - \omega_s.
\end{aligned}
\]
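In addition, by (12.57) and (12.58), using the addition formula for sine, and noting
that $(m-1)\omega_s - \theta = (m\omega_s - \theta) - \omega_s$, Euler did a short calculation to obtain
\[
\frac{A_s + B_s\cos\omega_s}{\sin\omega_s} = \frac{\cos(m\omega_s - \theta)}{k\sin\theta}.
\]
Therefore, the sum of the arctangents at $x = \infty$ is
\[
\frac{1}{k\sin\theta}\sum_{s=0}^{k-1}(\pi - \omega_s)\cos(m\omega_s - \theta). \tag{12.64}
\]
Set $\frac{\pi}{k} = \beta$, $\pi - \frac{\theta}{k} = \gamma$ and let $\alpha, \zeta$ be as before. Euler denoted the sum in (12.64),
omitting the factor $\frac{1}{k\sin\theta}$, by $S$ and noted that, since
\[
\pi - \omega_s = \gamma - 2s\beta \quad\text{and}\quad m\omega_s - \theta = 2s\alpha + \zeta,
\]
the addition formula for the sine function implied
\[
\begin{aligned}
2S\sin\alpha &= \sum_{s=0}^{k-1} 2(\gamma - 2s\beta)\cos(2s\alpha + \zeta)\sin\alpha \\
&= \sum_{s=0}^{k-1}(\gamma - 2s\beta)\bigl(\sin((2s+1)\alpha + \zeta) - \sin((2s-1)\alpha + \zeta)\bigr) \\
&= -\gamma\sin(-\alpha + \zeta) + (\gamma - 2(k-1)\beta)\sin((2k-1)\alpha + \zeta) \\
&\quad + \sum_{s=1}^{k-1}\bigl(\gamma - 2s\beta - (\gamma - 2(s+1)\beta)\bigr)\sin((2s-1)\alpha + \zeta) \\
&= \gamma\sin(\alpha - \zeta) + (\gamma - 2(k-1)\beta)\sin((2k-1)\alpha + \zeta) + \beta T,
\end{aligned} \tag{12.65}
\]
where $T = 2\bigl(\sin(\alpha + \zeta) + \sin(3\alpha + \zeta) + \cdots + \sin((2k-3)\alpha + \zeta)\bigr)$.
Euler summed $T$ by using the addition formula for cosine, arriving at
\[
\begin{aligned}
T\sin\alpha &= \cos\zeta - \cos(2\alpha + \zeta) + \cos(2\alpha + \zeta) - \cos(4\alpha + \zeta) + \cdots \\
&\quad + \cos\left((2k-4)\alpha + \zeta\right) - \cos\left((2k-2)\alpha + \zeta\right) \\
&= \cos\zeta - \cos\left((2k-2)\alpha + \zeta\right) \\
&= \cos\frac{\theta(k-m)}{k} - \cos\frac{2m\pi + \theta(k-m)}{k} \\
&= 2\sin\frac{m\pi + \theta(k-m)}{k}\,\sin\frac{m\pi}{k} \\
&= 2\sin(\alpha - \zeta)\sin\alpha.
\end{aligned}
\]
Thus, $T = 2\sin(\alpha - \zeta)$.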
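Euler substituted this value of $T$ in (12.65) and applied the identity $\sin A + \sin B =
2\sin\frac{A+B}{2}\cos\frac{A-B}{2}$ so that
\[
\begin{aligned}
2S\sin\alpha &= (\gamma + 2\beta)\sin(\alpha - \zeta) + (\gamma - 2(k-1)\beta)\sin((2k-1)\alpha + \zeta) \\
&= (\gamma + 2\beta)\bigl(\sin(\alpha - \zeta) + \sin((2k-1)\alpha + \zeta)\bigr) - 2k\beta\sin((2k-1)\alpha + \zeta) \\
&= (2\gamma + 4\beta)\sin\alpha k\,\cos((k-1)\alpha + \zeta) - 2k\beta\sin((2k-1)\alpha + \zeta).
\end{aligned} \tag{12.66}
\]
Now in (12.66), $\sin\alpha k = \sin m\pi = 0$. So
\[
S = -k\beta\,\frac{\sin((2k-1)\alpha + \zeta)}{\sin\alpha} = \pi\,\frac{\sin\frac{m\pi + \theta(k-m)}{k}}{\sin\frac{m\pi}{k}}.
\]
Thus, Euler had the final result
\[
\int_0^\infty \frac{x^{m-1}\,dx}{1 - 2x^k\cos\theta + x^{2k}} = \frac{\pi\sin\left(\frac{m(\pi-\theta)}{k} + \theta\right)}{k\sin\theta\,\sin\frac{m\pi}{k}}. \tag{12.67}
\]
The special case $\theta = \frac{\pi}{2}$ gave Euler the value of the beta integral
\[
\int_0^\infty \frac{x^{m-1}}{1 + x^{2k}}\,dx = \frac{\pi}{2k\sin\frac{m\pi}{2k}}, \tag{12.68}
\]
and using integration by parts, or from $\theta = \pi$ in (12.67), he obtained
\[
\int_0^\infty \frac{x^{m-1}\,dx}{(1 + x^k)^2} = \frac{\left(1 - \frac{m}{k}\right)\pi}{k\sin\frac{m\pi}{k}}. \tag{12.69}
\]
In fact, Euler himself mentioned his use of integration by parts in this case in
section 34 of his 1743 paper on integration, published by the Berlin Academy.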
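Euler's closed form (12.67) can be compared with a direct numerical quadrature. The following is a rough sketch under stated assumptions: the function names are hypothetical, the half-line is mapped to the unit interval by $x = t/(1-t)$, and a plain midpoint rule is used, so only moderate accuracy should be expected. It assumes integers $0 < m < 2k$ with $m \neq k$ and $0 < \theta < \pi$.

```python
import math

def closed_form(m, k, theta):
    """Right-hand side of (12.67)."""
    return (math.pi * math.sin(m * (math.pi - theta) / k + theta)
            / (k * math.sin(theta) * math.sin(m * math.pi / k)))

def integrand(x, m, k, theta):
    """x^(m-1) / (1 - 2 x^k cos(theta) + x^(2k))."""
    return x ** (m - 1) / (1 - 2 * x ** k * math.cos(theta) + x ** (2 * k))

def integrate_half_line(f, steps=200000):
    """Midpoint rule for the integral of f over (0, infinity) via x = t / (1 - t)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        x = t / (1.0 - t)
        total += f(x) / (1.0 - t) ** 2  # dx = dt / (1 - t)^2
    return total * h

if __name__ == "__main__":
    m, k, theta = 2, 3, 1.0
    print(integrate_half_line(lambda x: integrand(x, m, k, theta)),
          closed_form(m, k, theta))
```

Taking $\theta = \frac{\pi}{2}$, $m = 1$, $k = 2$ reproduces the special case (12.68), namely $\int_0^\infty \frac{dx}{1+x^4} = \frac{\pi}{2\sqrt{2}}$.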

At this point, Euler set $\pi - \theta = \eta$, so that $\sin\theta = \sin\eta$ and $\cos\theta = -\cos\eta$ in
(12.67), and obtained
\[
\int_0^\infty \frac{x^{m-1}\,dx}{1 + 2x^k\cos\eta + x^{2k}} = \frac{\pi\sin\left(1 - \frac{m}{k}\right)\eta}{k\sin\eta\,\sin\frac{m\pi}{k}}. \tag{12.70}
\]
He next gave an interesting method for expanding the integrand of (12.70) as a
series in powers of $x$. He wrote
\[
\frac{\sin\eta}{1 + 2x^k\cos\eta + x^{2k}} = \sin\eta + Ax^k + Bx^{2k} + Cx^{3k} + Dx^{4k} + \cdots, \tag{12.71}
\]
multiplied this by the denominator of its left-hand side, $1 + 2x^k\cos\eta + x^{2k}$, and
arrived at
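\[
\begin{aligned}
\sin\eta &= \left(1 + 2x^k\cos\eta + x^{2k}\right)\left(\sin\eta + Ax^k + Bx^{2k} + Cx^{3k} + \cdots\right) \\
&= \sin\eta + (A + 2\sin\eta\cos\eta)x^k + (B + 2A\cos\eta + \sin\eta)x^{2k} + \cdots.
\end{aligned}
\]
Equating coefficients of the powers of $x$, Euler could write the relations:
\[
\begin{aligned}
A + 2\sin\eta\cos\eta = 0 \quad&\text{or}\quad A = -\sin 2\eta, \\
B + 2A\cos\eta + \sin\eta = 0 \quad&\text{or}\quad B = 2\sin 2\eta\cos\eta - \sin\eta \\
&\phantom{\text{or}\quad B} = \sin(2\eta + \eta) + \sin(2\eta - \eta) - \sin\eta \\
&\phantom{\text{or}\quad B} = \sin 3\eta, \\
C + 2B\cos\eta + A = 0 \quad&\text{or}\quad C = -(2\sin 3\eta\cos\eta - \sin 2\eta) \\
&\phantom{\text{or}\quad C} = -\bigl(\sin(3\eta + \eta) + \sin(3\eta - \eta) - \sin 2\eta\bigr) \\
&\phantom{\text{or}\quad C} = -\sin 4\eta, \\
D + 2C\cos\eta + B = 0 \quad&\text{or}\quad D = \sin 5\eta,
\end{aligned}
\]
and so on. Thus, after substituting these values in (12.71):
\[
\frac{\sin\eta}{1 + 2x^k\cos\eta + x^{2k}} = \sin\eta - \sin 2\eta\, x^k + \sin 3\eta\, x^{2k} - \sin 4\eta\, x^{3k} + \cdots. \tag{12.72}
\]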
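The coefficient relations leading to (12.72) form a three-term recurrence, which can be run mechanically. In this minimal sketch the function name `series_coefficients` is hypothetical and `eta` is an arbitrary sample angle; the output is compared with the pattern $(-1)^j\sin((j+1)\eta)$ read off from (12.72).

```python
import math

def series_coefficients(eta, count):
    """Coefficients c_0, c_1, ... of x^(jk) in sin(eta) / (1 + 2 x^k cos(eta) + x^(2k)).

    Equating powers of x gives c_0 = sin(eta) and
    c_j + 2 cos(eta) c_{j-1} + c_{j-2} = 0 for j >= 1 (with c_{-1} = 0).
    """
    coeffs = [math.sin(eta)]
    for j in range(1, count):
        prev2 = coeffs[j - 2] if j >= 2 else 0.0
        coeffs.append(-2 * math.cos(eta) * coeffs[j - 1] - prev2)
    return coeffs

if __name__ == "__main__":
    eta = 0.4
    for j, c in enumerate(series_coefficients(eta, 6)):
        print(j, c, (-1) ** j * math.sin((j + 1) * eta))
```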


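Euler noted that
\[
\begin{aligned}
\sin\eta\int_0^\infty \frac{x^{m-1}\,dx}{1 + 2x^k\cos\eta + x^{2k}}
&= \sin\eta\left(\int_0^1 + \int_1^\infty\right)\frac{x^{m-1}\,dx}{1 + 2x^k\cos\eta + x^{2k}} \\
&= \sin\eta\int_0^1 \frac{x^{m-1} + x^{2k-m-1}}{1 + 2x^k\cos\eta + x^{2k}}\,dx,
\end{aligned} \tag{12.73}
\]
where the term $x^{2k-m-1}$ was obtained by changing $x$ to $\frac{1}{x}$ in the integral $\int_1^\infty$. On term
by term integration and using (12.72) in (12.73), and applying (12.70), he had
\[
\begin{aligned}
\frac{\pi\sin\left(1 - \frac{m}{k}\right)\eta}{k\sin\frac{m\pi}{k}}
&= \int_0^1 \left(x^{m-1} + x^{2k-m-1}\right)\sum_{s=1}^\infty (-1)^{s-1}x^{(s-1)k}\sin s\eta\;dx \\
&= \sum_{s=1}^\infty (-1)^{s-1}\left(\frac{1}{m + (s-1)k} + \frac{1}{(s+1)k - m}\right)\sin s\eta.
\end{aligned} \tag{12.74}
\]
Euler next assumed that $\eta$ was an infinitesimal, so that
\[
\sin\left(1 - \frac{m}{k}\right)\eta = \left(1 - \frac{m}{k}\right)\eta, \quad \sin\eta = \eta, \quad \sin 2\eta = 2\eta, \quad\text{and so on};
\]
after division by $\eta$, he obtained
\[
\frac{\left(1 - \frac{m}{k}\right)\pi}{k\sin\frac{m\pi}{k}} = \sum_{s=1}^\infty \frac{(-1)^{s-1}\,2s^2k}{(m + (s-1)k)((s+1)k - m)}. \tag{12.75}
\]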
In modern terms, Euler obtained (12.75) by dividing (12.74) by $\eta$, allowing $\eta \to 0$,
and applying $\lim_{\eta\to 0}\frac{\sin s\eta}{\eta} = s$. He set $m = k - n$ to put (12.74) into more symmetrical
form:
\[
\frac{\pi\sin\frac{n\eta}{k}}{2k^2\sin\frac{n\pi}{k}} = \sum_{s=1}^\infty \frac{(-1)^{s-1}s\sin s\eta}{s^2k^2 - n^2} \tag{12.76}
\]
and then differentiated to derive
\[
\frac{n\pi\cos\frac{n\eta}{k}}{2k^3\sin\frac{n\pi}{k}} = \sum_{s=1}^\infty \frac{(-1)^{s-1}s^2\cos s\eta}{s^2k^2 - n^2}.
\]
He also integrated (12.76) to obtain
\[
\frac{\pi\cos\frac{n\eta}{k}}{2nk\sin\frac{n\pi}{k}} = \frac{1}{2n^2} + \sum_{s=1}^\infty \frac{(-1)^{s-1}\cos s\eta}{s^2k^2 - n^2}.
\]
However, he also required (12.49) to determine the value of the constant of
integration $\frac{1}{2n^2}$.

Now in 1775, at the time he wrote the paper we are discussing, Euler was well
aware of the factorization
\[
1 + 2x^k\cos\eta + x^{2k} = \left(1 + e^{i\eta}x^k\right)\left(1 + e^{-i\eta}x^k\right).
\]
See, for example, his paper “Summatio progressionum,”⁴⁶ presented to the
Petersburg Academy in 1773, published in 1774. He gives this factorization explicitly
in exemplum 1, near the end of the paper, where he also gives the formula, which may
be compared with (12.72):
\[
\frac{x\sin\phi}{1 - 2x\cos\phi + x^2} = x\sin\phi + x^2\sin 2\phi + x^3\sin 3\phi + \cdots.
\]
This formula is equivalent to (12.72) and involves only the geometric series:
\[
\begin{aligned}
\frac{2ix\sin\phi}{\left(1 - e^{i\phi}x\right)\left(1 - e^{-i\phi}x\right)} &= \frac{1}{1 - e^{i\phi}x} - \frac{1}{1 - e^{-i\phi}x} \\
&= \left(e^{i\phi} - e^{-i\phi}\right)x + \left(e^{2i\phi} - e^{-2i\phi}\right)x^2 + \cdots,
\end{aligned}
\]
an observation that would have yielded a simpler derivation. But Euler’s purpose here was rather to
give multiple derivations of the same result.


12.7 Hermite’s Rational Part Algorithm 


As a professor at the École Polytechnique, Hermite lectured on analysis. This gave
him the opportunity to rethink several elementary topics. He often came up with new 


46 Eu. I-15 pp. 168-184. E 447.
proofs and presentations of old material. In his lectures, published in 1873, Hermite
gave a method for finding the rational part of the integral of a rational function, by
employing the Euclidean algorithm.⁴⁷ He first found the square-free factorization of
the denominator $Q(x)$ of the rational function $\frac{P(x)}{Q(x)}$:
\[
Q = Q_1 Q_2^2 Q_3^3 \cdots Q_n^n, \tag{12.77}
\]
where $Q_1, Q_2, \ldots, Q_n$ were relatively prime polynomials with simple roots. This
decomposition could be accomplished by the Euclidean algorithm, but Hermite did
not give details in his published lectures. Note that there existed polynomials $P_1, P_2,
\ldots, P_n$, such that
\[
\frac{P}{Q} = \frac{P_1}{Q_1} + \frac{P_2}{Q_2^2} + \cdots + \frac{P_n}{Q_n^n}. \tag{12.78}
\]
As a first step in the derivation of this relation, Hermite observed that $U = Q_1$
and $V = Q_2^2 \cdots Q_n^n$ were relatively prime and hence by the Euclidean algorithm,
there existed polynomials $P_1$ and $\bar{P}_1$ such that
\[
P = P_1 V + \bar{P}_1 U
\]
or
\[
\frac{P}{Q} = \frac{P_1}{Q_1} + \frac{\bar{P}_1}{Q_2^2 \cdots Q_n^n}. \tag{12.79}
\]
Hermite obtained the required result by a repeated application of this procedure.
Since
\[
\int \frac{P}{Q}\,dx = \int \frac{P_1}{Q_1}\,dx + \int \frac{P_2}{Q_2^2}\,dx + \cdots + \int \frac{P_n}{Q_n^n}\,dx, \tag{12.80}
\]
he needed a method to reduce $\int \frac{P_k}{Q_k^k}\,dx$ to $\int \frac{E}{Q_k}\,dx$ for some polynomial $E$, when $k > 1$.
Since $Q_k$ had simple roots, $Q_k$ and its derivative $Q_k'$ were relatively prime. Thus, there
existed polynomials $C$ and $D$ such that
\[
P_k = CQ_k + DQ_k'
\]
and
\[
\frac{P_k}{Q_k^k} = \frac{C}{Q_k^{k-1}} + \frac{DQ_k'}{Q_k^k} = \frac{C}{Q_k^{k-1}} - \frac{D}{k-1}\,\frac{d}{dx}\left(\frac{1}{Q_k^{k-1}}\right).
\]

47 Hermite (1905-1917) vol. 3, pp. 40-44. See also Hardy (1905) pp. 13-16. 
After integration by parts, he obtained the necessary reduction
\[
\int \frac{P_k}{Q_k^k}\,dx = -\frac{D}{(k-1)Q_k^{k-1}} + \int \frac{C + \frac{D'}{k-1}}{Q_k^{k-1}}\,dx. \tag{12.81}
\]
Again, by a repeated application of this algorithm, Hermite had
\[
\int \frac{P}{Q}\,dx = R + \int \frac{E_1}{Q_1}\,dx + \int \frac{E_2}{Q_2}\,dx + \int \frac{E_3}{Q_3}\,dx + \cdots + \int \frac{E_n}{Q_n}\,dx, \tag{12.82}
\]
where $R$ was a rational function and $E_1, E_2, \ldots, E_n$ were polynomials.
Since $Q_1, Q_2, \ldots, Q_n$ were pairwise relatively prime and had simple roots, the
integrals on the right-hand side formed the transcendental part of the original integral.
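A small instance of the reduction (12.81) may help fix ideas. Take $P_k = 1$, $Q_k = x^2 + 1$ and $k = 2$; then $C = 1$ and $D = -\frac{x}{2}$ satisfy $P_k = CQ_k + DQ_k'$, and (12.81) gives the rational part $\frac{x}{2(x^2+1)}$ with remaining integral $\int \frac{1/2}{x^2+1}\,dx$. The sketch below, with hypothetical function names and this example chosen purely for illustration, checks the Bézout identity and the resulting antiderivative numerically.

```python
import math

def bezout_residual(x):
    """C*Q + D*Q' - P for P = 1, Q = x^2 + 1, C = 1, D = -x/2; should vanish."""
    C, D = 1.0, -x / 2.0
    Q, Q_prime = x * x + 1.0, 2.0 * x
    return C * Q + D * Q_prime - 1.0

def integrand(x):
    """P / Q^2 with P = 1 and Q = x^2 + 1."""
    return 1.0 / (x * x + 1.0) ** 2

def hermite_antiderivative(x):
    """Rational part -D/((k-1)Q) plus the integral of (C + D'/(k-1))/Q."""
    rational_part = x / (2.0 * (x * x + 1.0))
    transcendental_part = 0.5 * math.atan(x)
    return rational_part + transcendental_part

def central_difference(f, x, h=1e-5):
    """Symmetric finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

if __name__ == "__main__":
    for x in (-2.0, 0.3, 1.7):
        print(x, central_difference(hermite_antiderivative, x), integrand(x))
```

Here the transcendental part is a single arctangent, in line with the remark that the integrals over the square-free denominators carry the logarithms and arctangents.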


12.8 Johann Bernoulli: Integration of $\sqrt{ax^2 + bx + c}$


We know that Isaac Barrow geometrically evaluated the integrals $\int \sqrt{x^2 + a^2}\,dx$ and
$\int \frac{dx}{\sqrt{x^2 + a^2}}$ and that his results could be immediately converted to analytic form.⁴⁸
Roger Cotes included in his tables of integrals those of the form
\[
\int R(x,t)\,dx \quad\text{with}\quad t = \sqrt{ax^2 + bx + c},
\]
where the integrand was a rational function of $x$ and $t$.⁴⁹ Clearly, seventeenth- and
eighteenth-century mathematicians knew how to handle such integrals. But Johann
Bernoulli pointed out in his very first lecture on integration, contained in vol. 3 of his
Opera Omnia, that there was another method, related to Diophantine problems. By a
substitution used in the study of Diophantine equations, the integral $\int R(x,t)\,dx$ could
be rationalized. At the end of his lecture, Bernoulli illustrated this idea by means of
an example.⁵⁰ His problem was to integrate $a^3\,dx : x\sqrt{ax - x^2}$; his method was to
rewrite the quantity within the root as a square containing $x$ and a newly introduced
variable. In this case, he had $ax - x^2 = a^2x^2 : m^2$. Thus, $x = am^2 : (m^2 + a^2)$, $dx =
2a^3m\,dm : (m^2 + a^2)^2$ and
\[
\int \frac{a^3\,dx}{x\sqrt{ax - x^2}} = \int \frac{2a^3\,dm}{m^2} = -\frac{2a^3}{m}.
\]
We note that a general substitution of the form $ax^2 + bx + c = (u + \sqrt{a}\,x)^2$ could
be used to rationalize integrals involving $\sqrt{ax^2 + bx + c}$.
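Bernoulli's example can be checked numerically by expressing $m$ in terms of $x$ and differentiating $-2a^3/m$. This is a minimal sketch with hypothetical function names, valid under the assumption $0 < x < a$ so that the square root is real; the relation $m = ax/\sqrt{ax - x^2}$ is just $ax - x^2 = a^2x^2 : m^2$ solved for positive $m$.

```python
import math

def integrand(x, a):
    """a^3 / (x * sqrt(a x - x^2)), defined for 0 < x < a."""
    return a ** 3 / (x * math.sqrt(a * x - x * x))

def m_of_x(x, a):
    """Positive solution m of a x - x^2 = a^2 x^2 / m^2."""
    return a * x / math.sqrt(a * x - x * x)

def antiderivative(x, a):
    """Bernoulli's result -2 a^3 / m, written as a function of x."""
    return -2 * a ** 3 / m_of_x(x, a)

def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

if __name__ == "__main__":
    a = 2.0
    for x in (0.5, 1.0, 1.5):
        m = m_of_x(x, a)
        # x should be recovered from m by x = a m^2 / (m^2 + a^2)
        print(x, a * m * m / (m * m + a * a),
              central_difference(lambda t: antiderivative(t, a), x),
              integrand(x, a))
```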


48 Barrow (1916) pp. 160-161 and p. 185. 
49 Gowing (1983) pp. 46-48. 
50 Joh. Bernoulli (1968) vol. 3, p. 393. 
12.9 Exercises


(1) Prove Newton’s formula (12.3) by showing
(i) $\int_0^x \frac{1+t^2}{1+t^4}\,dt = \frac{1}{\sqrt{2}}\arctan\frac{x\sqrt{2}}{1-x^2}$,
(ii) $\int_0^x \frac{1+t^2}{1+t^4}\,dt = x + \frac{x^3}{3} - \frac{x^5}{5} - \frac{x^7}{7} + \frac{x^9}{9} + \frac{x^{11}}{11} - \cdots$, for $0 < x < 1$.
Use Newton’s factorization (12.19).

(2) In his letter to Oldenburg dated October 24, 1676, Newton remarked that
Leibniz’s series and his own variant of it were unsuitable for the approximate
evaluation of $\pi$: “For if one wished by the simple calculation of this series
$1 + \frac13 - \frac15 - \frac17 + \frac19 +$ etc. to find the length of the quadrant to twenty decimal
places, it would need about 5000000000 terms of the series, for the calculation
of which 1000 years would be required.” He recommended his series for
arcsin for this purpose. He suggested another formula to evaluate $\pi$:
\[
\begin{aligned}
\frac{\pi}{4} &= \frac{a}{1} - \frac{a^3}{3} + \frac{a^5}{5} - \frac{a^7}{7} + \text{etc.} \\
&\quad + \frac{a^2}{1} + \frac{a^5}{3} - \frac{a^8}{5} - \frac{a^{11}}{7} + \frac{a^{14}}{9} + \frac{a^{17}}{11} - \text{etc.} \\
&\quad + \frac{a^4}{1} - \frac{a^{10}}{3} + \frac{a^{16}}{5} - \frac{a^{22}}{7} + \frac{a^{28}}{9} - \text{etc.},
\end{aligned}
\]
where $a = \frac12$. Prove this formula and show that it is equivalent to
\[
\frac{\pi}{4} = \arctan\frac12 + \frac12\arctan\frac47 + \frac12\arctan\frac18.
\]

Also prove Newton’s formula: 


IT 
toto pth Bt os apt ee ge) 


(3) Derive Newton’s equations for $n$ defined by (12.18) when $m = 3, 4, 5, 6, 7,
8, 9, 10, 11$, and 12. For these values of $m$, Newton had, respectively,
\[
\begin{gathered}
nn - 1 = 0, \quad n^3 - 2n = 0, \quad n^4 - 3nn + 1 = 0, \quad n^5 - 4n^3 + 3n = 0, \\
n^6 - 5n^4 + 6nn - 1 = 0, \quad n^7 - 6n^5 + 10n^3 - 4n = 0, \\
n^8 - 7n^6 + 15n^4 - 10nn + 1 = 0, \\
n^9 - 8n^7 + 21n^5 - 20n^3 + 5n = 0, \\
n^{10} - 9n^8 + 28n^6 - 35n^4 + 15nn - 1 = 0, \\
n^{11} - 10n^9 + 36n^7 - 56n^5 + 35n^3 - 6n = 0.
\end{gathered}
\]
Use Newton’s equation for $m = 7$ to show that the cubic equation satisfied
by $2\cos\frac{2\pi}{7}$ is $x^3 + x^2 - 2x - 1 = 0$.


(4) Prove Newton’s integration formula (12.1). 


(5) Show that for $e > 0$ and $f < 0$
\[
\int \frac{dx}{e + fx^2} = \frac{1}{e}\,R\ln\frac{R+T}{S},
\]
where $R = \sqrt{-\frac{e}{f}}$, $T = x$, $S = \sqrt{x^2 + \frac{e}{f}}$. Then show that for $e > 0$ and
$f > 0$
\[
\int \frac{dx}{e + fx^2} = \frac{1}{e}\,R\arctan\frac{x}{R},
\]
where $R = \sqrt{\frac{e}{f}}$. Compare these results with comments on Cotes’s notation
for integrals of rational functions in the preliminary remarks for this chapter.


(6) Prove that
\[
\int \frac{dx}{e + fx^2 + gx^4}
\]
equals
(i) (when $4eg < f^2$ and $b^2 = e$) $\int \frac{\alpha\,dx}{b + mx^2} + \int \frac{\beta\,dx}{b + nx^2}$, where $\alpha, \beta, m, n$
can be determined in terms of $e, f, g$;
(ii) (when $4eg > f^2$ and $b^2 = e$) $\int \frac{(\alpha + \gamma x)\,dx}{b + nx + mx^2} + \int \frac{(\beta + \epsilon x)\,dx}{b - nx + mx^2}$,
where $\alpha, \beta, \gamma, \epsilon, m, n$ can be determined in terms of $e, f$ and $g$.
Bernoulli published an entertaining paper containing this result in the
Acta Eruditorum in response to a challenge from Brook Taylor. See Joh.
Bernoulli (1968) vol. II, p. 409.


(7) Use Hermite’s reduction formula (12.81) to show that the integral
\[
\int \frac{4x^9 + 21x^6 + 2x^3 - 3x^2 - 3}{(x^7 - x + 1)^2}\,dx
\]
has only the rational part $-\frac{x^3 + 3}{x^7 - x + 1}$ and no transcendental part. See
G. H. Hardy (1905) pp. 14-15.
(8) Prove that 


2n—2 dx _ qui be3*-# Ca 3) / dx | 
¢ | woe y ae | paw fn-10). 
where
2k 


— Qk) a 
fi@) =a (13 x 5 2A a): 


See Hermite (1905-1917) vol. 3, p. 50. 
(9) Show that 


/ (1 —x)dx 
x4(2x — 1)3(3x — 2)2(4x — 3) 


1 y) 1879 24499 8 48 
= CG f Inx { t 
36x3 18xx =432x 1296 (2x—1)? 2x-1 
729 3645 2048 
—272 In(2 1)4 In(3x — 2) 4 In(4x — 3). 
MW eayeoy Mee et ae 


See Eu. I-17 p. 165. 
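(10) Show that
\[
\begin{aligned}
\int \frac{dx}{1 + x^5} &= \frac{1}{5}\ln(1+x) + \frac{1}{5}\cos\frac{2\pi}{5}\,\ln\left(1 + 2x\cos\frac{2\pi}{5} + x^2\right) \\
&\quad + \frac{2}{5}\sin\frac{2\pi}{5}\,\arctan\frac{x\sin\frac{2\pi}{5}}{1 + x\cos\frac{2\pi}{5}} \\
&\quad - \frac{1}{5}\cos\frac{\pi}{5}\,\ln\left(1 - 2x\cos\frac{\pi}{5} + x^2\right) \\
&\quad + \frac{2}{5}\sin\frac{\pi}{5}\,\arctan\frac{x\sin\frac{\pi}{5}}{1 - x\cos\frac{\pi}{5}},
\end{aligned}
\]
where
\[
\cos\frac{\pi}{5} = \frac{1+\sqrt5}{4}, \quad \sin\frac{\pi}{5} = \frac{\sqrt{10 - 2\sqrt5}}{4}, \quad
\cos\frac{2\pi}{5} = \frac{-1+\sqrt5}{4}, \quad \sin\frac{2\pi}{5} = \frac{\sqrt{10 + 2\sqrt5}}{4}.
\]
Also show that
\[
\int \frac{dx}{1 + x^6} = \frac{\sqrt3}{12}\ln\frac{1 + x\sqrt3 + x^2}{1 - x\sqrt3 + x^2} + \frac{1}{3}\arctan x
+ \frac{1}{6}\arctan\frac{x}{2 + x\sqrt3} + \frac{1}{6}\arctan\frac{x}{2 - x\sqrt3}
\]
and note that this implies
\[
1 - \frac17 + \frac1{13} - \frac1{19} + \cdots = \frac{\pi}{6} + \frac{\sqrt3}{6}\ln\left(2 + \sqrt3\right).
\]
See Eu. I-17 pp. 131 and 120, respectively, for the two formulas.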
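(11) Show that for $0 < m < 2k$,
\[
\int_0^\infty \frac{x^{m-1}\,dx}{1 + 2x^k\cos\theta + x^{2k}} = \frac{\pi\sin\left(1 - \frac{m}{k}\right)\theta}{k\sin\theta\,\sin\frac{m\pi}{k}}.
\]
See Eu. I-18 p. 202.
(12) Show that for $0 < m < k$
\[
\int_0^\infty \frac{x^{m-1}\,dx}{(1 + x^k)^n} = \frac{\pi}{k\sin\frac{m\pi}{k}}\prod_{j=1}^{n-1}\left(1 - \frac{m}{jk}\right).
\]
See Eu. I-18 p. 188.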
(13) Show that
\[
\int_0^\infty \frac{dx}{(x^4 + 2ax^2 + 1)^{m+1}} = \frac{\pi P_m(a)}{2^{m+3/2}(a+1)^{m+1/2}},
\]
where
\[
P_m(a) = 2^{-2m}\sum_{k=0}^{m} 2^k\binom{2m-2k}{m-k}\binom{m+k}{m}(a+1)^k.
\]
See Moll (2002) or Boros and Moll (2004) p. 154 for this example and for
some intriguing open questions related to the integration of rational functions.


12.10 Notes on the Literature 


The Acta Eruditorum was founded in 1682 by Otto Mencke (1644-1707), a Leipzig 
professor of moral and political philosophy. Leibniz and the two elder Bernoullis 
published many of their papers in this journal. Newton’s two letters were first 
published in full in Commercium Epistolicum D. Johannis Collins, et aliorum de 
analysi promota of 1712. Newton had these letters published in order to document his 
priority in the calculus, partially in response to Leibniz’s 1705 review of Newton’s 
“Tractatus de Quadratura Curvarum” in which Leibniz claimed priority for the 
calculus. In his review, Leibniz charged Newton with appropriating his results by 
changing Leibniz’s differentials to fluxions. Smith (1959), on pp. 440-454 of vol. II,
gives an English translation of parts of de Moivre’s works on factorization and 
related topics. 


13 


Difference Equations 


13.1 Preliminary Remarks 


Difference equations occur in discrete problems, such as are encountered in probability
theory, where recursion is an oft-used method. In the mid-seventeenth century,
probability was developing as a new discipline; Pascal and Huygens used recursion, or
first-order difference equations, in working out some elementary probability problems.
Later, in the early eighteenth century, Niklaus I Bernoulli, Montmort, and de Moivre
made use of more general difference equations. By the 1710s, it was clear that a
general method for solving linear difference equations would be of great significance
in probability and in analysis. Bernoulli and Montmort corresponded on this topic,
discussing their methods for solving second-order difference equations with constant
coefficients. In particular, they found the general term of the Fibonacci sequence.
In 1712, Bernoulli also solved a special homogeneous linear equation of general
degree with constant coefficients. He accomplished this in the course of tackling the
well-known Waldegrave problem, involving the probability of winning a game, given
players of equal skill.¹ Then in 1715, Montmort rediscovered and communicated to
de Moivre Newton’s transformation, (10.3). This revealed the connection between
difference equations and the summation of infinite series.² It was an easy consequence
of the Newton-Montmort transformation formula that the difference equation
\[
\Delta^n A_k = A_{n+k} - \binom{n}{1}A_{n+k-1} + \binom{n}{2}A_{n+k-2} - \cdots + (-1)^n A_k = 0, \tag{13.1}
\]
$k = 0,1,2,\ldots$, implied that the series $\sum_{k=0}^\infty A_k x^k$ was a rational function with
$(1 - x)^n$ as denominator. More generally, de Moivre called a series recurrent if its
coefficients satisfied the recurrence relation
\[
a_0 A_{n+k} + a_1 A_{n+k-1} + \cdots + a_n A_k = 0, \tag{13.2}
\]


1 See Hald (1990) pp. 375-393.
2 Montmort (1717). 
where $a_0, a_1, \ldots, a_n$ were constants and $k = 0,1,2,\ldots$. De Moivre was the first to
present a general theory of recurrent series. He proved that such a series could be
represented by a rational function and showed how to find this function. He then
applied partial fractions to obtain the general expression for $A_n$ in terms of the
roots of the denominator of the rational function. De Moivre was therefore the first
mathematician to solve a general linear difference equation with constant coefficients
by generating functions. He expounded this theory without proofs in the first edition
of his Doctrine of Chances, published in 1717,³ but he provided proofs in his 1730
Miscellanea Analytica.⁴
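The statement attached to (13.1) can be illustrated with a short computation: if the $n$th differences of the coefficients vanish, multiplying the series by $(1-x)^n$ leaves a polynomial of degree less than $n$. The sketch below uses a hypothetical helper name and the sample sequence $A_k = k^2$, for which the third differences are zero; both choices are assumptions made for illustration only.

```python
from math import comb

def times_one_minus_x_power(coeffs, n):
    """Coefficients of (1 - x)^n * sum(coeffs[k] x^k), truncated to len(coeffs).

    The coefficient of x^k is sum_j (-1)^j C(n, j) coeffs[k - j], which for
    k >= n is exactly the nth difference in (13.1) applied at index k - n.
    """
    out = []
    for k in range(len(coeffs)):
        out.append(sum((-1) ** j * comb(n, j) * coeffs[k - j]
                       for j in range(min(n, k) + 1)))
    return out

if __name__ == "__main__":
    squares = [k * k for k in range(12)]
    print(times_one_minus_x_power(squares, 3))
```

For the squares the numerator comes out as $x + x^2$, in agreement with the familiar sum $\sum_{k\ge 0} k^2x^k = \frac{x(1+x)}{(1-x)^3}$.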

In the 1720s, several mathematicians turned their attention toward recurrent series.
Daniel Bernoulli (1700-1782) made some very early investigations into this topic
without making much headway, being unaware of the results of de Moivre, Niklaus I
Bernoulli, and Montmort. In his Exercitationes of 1724, he stated that there was
no formula for the general term of the sequence 1, 3, 4, 7, 11, 18, .... Niklaus
informed his cousin Daniel that this was false and that the general term should be
$\left(\frac{1+\sqrt5}{2}\right)^n + \left(\frac{1-\sqrt5}{2}\right)^n$. Apparently, Daniel Bernoulli subsequently became familiar
with the work of de Moivre and others. At the end of 1728, he wrote a paper
explaining the method, still contained in our textbooks, for giving special solutions
for homogeneous linear difference equations with constant coefficients, in which the
form of the solution is assumed and then substituted into the equation. The values of
the parameter in the assumed solution can then be determined by means of an algebraic
equation. He obtained the general solution by taking an arbitrary linear combination
of the special solutions.⁵
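The method described above can be sketched for the sequence 1, 3, 4, 7, 11, 18, ..., assuming it obeys $A_{n+2} = A_{n+1} + A_n$ with indexing starting at $n = 1$ (an assumption consistent with the listed terms). Substituting the trial solution $A_n = r^n$ gives the algebraic equation $r^2 = r + 1$; the function names below are hypothetical, and both coefficients of the linear combination are taken to be 1, as in the general term quoted from Niklaus Bernoulli.

```python
import math

def characteristic_roots():
    """Roots of r^2 = r + 1, obtained by substituting A_n = r^n."""
    s = math.sqrt(5)
    return (1 + s) / 2, (1 - s) / 2

def general_term(n):
    """Linear combination r1^n + r2^n of the two special solutions."""
    r1, r2 = characteristic_roots()
    return r1 ** n + r2 ** n

if __name__ == "__main__":
    print([round(general_term(n)) for n in range(1, 7)])
```

The coefficients of the combination are fixed by the first two terms: $r_1 + r_2 = 1$ and $r_1^2 + r_2^2 = 3$.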

Though the connection between differentials and finite differences had become 
clear by 1720, simultaneous advances in the two topics did not occur; one area seemed 
to make progress in alternation with the other. For example, D. Bernoulli’s 1728 
method of solution was not matched by a similar advance in the area of differential 
equations with constant coefficients until 1740. Euler, having defined the number e 
and the corresponding exponential function, then gave the general solution of a 
differential equation as a combination of special exponential functions. He used the 
exponential function as the form of the solution in giving a method for solving a 
differential equation with constant coefficients.⁶

Then, from the 1730s through the 1750s, the theory of differential equations made 
great strides, partially due to the application of this subject to physics problems, 
such as hanging chains and vibration of strings. In fact, d’Alembert, Euler, and 
Clairaut initiated the study of partial differential equations in this context. However, 
no corresponding progress took place in the area of difference equations until 1759 
when Lagrange had the inspiration of applying the progress made in differential
equations to difference equations.⁷ He found a technique, analogous to d’Alembert’s


3 For a reprint of the third edition, see de Moivre (1967).
4 de Moivre (1730a). 

5 Bernoulli (1982-1996) vol. 2, pp. 49-64. 

6 Eu. I-22, pp. 108-149. E 62.

7 Lagrange (1867-1892) vol. 1, pp. 23-36. 
method for differential equations, for solving a nonhomogeneous difference equation
by reducing its degree by one. A repeated application of this technique reduced a
general nth degree difference equation to a first-degree equation, already treated by
Taylor in 1715.⁸ Similarly, Lagrange adapted his method of variation of parameters for
solving differential equations to the case of difference equations. In the 1770s, Laplace
published several papers, extending Lagrange’s method and using other techniques to
solve linear difference equations with variable coefficients. In a paper written in 1780
and published in 1782,⁹ he introduced the term “generating functions.” He developed
this theory by the symbolic methods introduced in 1772 by Lagrange, courtesy of
Leibniz. Laplace also applied generating functions of two variables to solve partial
difference equations.¹⁰

De Moivre’s work on recurrent series also contained interesting, albeit implicit, 
results on infinite series, a topic of his earliest research and a life-long interest. In the 
1737 second edition of his Doctrine of Chances, he solved the problem of summing 
the series $\sum_{n=0}^{\infty} a_{np+k}x^n$, where p and k were integers and 0 < k < p, when the sum
of $\sum_{n=0}^{\infty} a_n x^n$ was known. In his solution, de Moivre dealt only with recurrent series,
but in 1758, Simpson published a paper tackling the general problem. Even a year
earlier, Waring also gave a general solution for summing the series $\sum_{n=0}^{\infty} a_{np+k}x^n$. An
expert in the area of symmetric functions, he obtained his solution by taking specific 
symmetric functions of roots of unity. He did not publish the paper, but communicated 
it to the Royal Society. Waring later wrote that he believed Simpson’s proof was based 
on this result, since Simpson was an active member of the society. 

Another of de Moivre’s results on series implied that if the recurrent series $\sum a_n x^n$
and $\sum b_n x^n$ had singularities at $\alpha$ and $\beta$, respectively, then $\sum a_n b_n x^n$ had a singularity
at $\alpha\beta$. In 1899 Hadamard extended this result to arbitrary power series,¹¹ though it
does not seem that he was motivated by de Moivre’s theorem.


13.2. De Moivre on Recurrent Series 


In his Doctrine of Chances, de Moivre wrote that the summation of series was 
required for the solution of several problems relating to chance, that is, to probabilistic 
problems.¹² He then presented a list of nine propositions connected with recurrent
series, series whose coefficients satisfied a linear recurrence relation. Thus, a series
$\sum_{n=0}^{\infty} a_n x^n$ was a recurrent series if there were constants $\alpha_1, \alpha_2, \ldots, \alpha_k$ such that $a_n$
satisfied the difference equation


$$a_n + \alpha_1 a_{n-1} + \alpha_2 a_{n-2} + \cdots + \alpha_k a_{n-k} = 0, \tag{13.3}$$


8 Taylor (1715). 

9 Laplace (1782) p. 5. 
10 For the early history of generating functions, see Seal (1949). 
11 Hadamard (1899).
12 de Moivre (1967) pp. 220-229. 
for $n = k, k+1, k+2, \ldots$. De Moivre started with the example of the series


$$1 + 2x + 3xx + 10x^3 + 34x^4 + 97x^5 + \cdots, \tag{13.4}$$


whose coefficients satisfied the equation


$$a_n - 3a_{n-1} + 2a_{n-2} - 5a_{n-3} = 0, \tag{13.5}$$


for $n = 3, 4, 5, \ldots$. His terminology is no longer used. For example, instead of the
recurrence relation (13.5), he called $3x - 2xx + 5x^3$, or simply $3 - 2 + 5$, the scale of
relation of the series. This scale of relation was used to sum the series. Thus, if we let
$S$ denote the series (13.4), then

$$\begin{aligned}
-3xS &= -3x - 6x^2 - 9x^3 - 30x^4 - 102x^5 - \cdots, \\
+2x^2 S &= 2x^2 + 4x^3 + 6x^4 + 20x^5 + \cdots, \\
-5x^3 S &= -5x^3 - 10x^4 - 15x^5 - \cdots.
\end{aligned}$$


Now add these three series to the original series (13.4) for $S$. Because of the recurrence
relation (13.5) satisfied by the coefficients, or because of the scale of relation of the
series, we get


$$(1 - 3x + 2x^2 - 5x^3)S = 1 - x - x^2.$$


All other terms on the right-hand side cancel and we have the sum of $S$:


$$S = \frac{1 - x - x^2}{1 - 3x + 2x^2 - 5x^3}. \tag{13.6}$$


De Moivre called the expression in the denominator the differential scale, since it 
was obtained by subtracting the scale of relation from unity. 
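
The cancellation that de Moivre relies on can be checked mechanically. The following is a minimal sketch, not de Moivre's own procedure: the helper names `recurrent_coeffs` and `poly_mul` are hypothetical, and the series is truncated to twelve terms. It generates the coefficients of (13.4) from the scale of relation $3 - 2 + 5$ and multiplies the truncated series by the differential scale.

```python
def recurrent_coeffs(initial, scale, count):
    """Extend `initial` with a_n = scale[0]*a_{n-1} + scale[1]*a_{n-2} + ..."""
    a = list(initial)
    while len(a) < count:
        a.append(sum(c * a[-1 - i] for i, c in enumerate(scale)))
    return a[:count]


def poly_mul(p, q, count):
    """First `count` coefficients of the product of two coefficient lists."""
    out = [0] * count
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j < count:
                out[i + j] += pi * qj
    return out


# Scale of relation 3 - 2 + 5, initial coefficients 1, 2, 3 as in (13.4).
coeffs = recurrent_coeffs([1, 2, 3], [3, -2, 5], 12)

# Differential scale 1 - 3x + 2x^2 - 5x^3, i.e. unity minus the scale of relation.
differential_scale = [1, -3, 2, -5]
numerator = poly_mul(differential_scale, coeffs, 12)

print(coeffs[:6])   # leading coefficients of the series
print(numerator)    # every coefficient from x^3 onward cancels
```

Only the coefficients of $1$, $x$, and $x^2$ survive in the product, which is the numerator $1 - x - x^2$ of (13.6).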

De Moivre’s purpose in summing S was to find the numerical value of the 
coefficient a, of the series, or, in modern terms, to solve the difference equation 
(13.5). Once he had S, he could factorize the denominator and obtain the partial 
fractions decomposition of the rational function S. Actually, he did not discuss this 
algebraic process in his book. He merely noted the form of a, when the denominator 
was a polynomial of degree $m$ with roots $\alpha_1, \alpha_2, \ldots, \alpha_m$ in the cases $m = 2, 3, 4$.
Moreover, he wrote the solutions for only those cases in which the roots were distinct. 
One can be sure that he knew how to handle the case of repeated roots, because 
only a knowledge of the binomial theorem for negative integral powers was required. 
Thus, all series $\sum_{n=0}^{\infty} a_n x^n$, whose coefficients $a_n$ satisfy a linear difference equation
with constant coefficients as in (13.3), must be rational functions. Conversely, the
power series expansions of rational functions whose numerators are of degree less
than the corresponding denominators are recurrent series. Euler devoted a chapter of
his Introductio in Analysin Infinitorum to recurrent series.¹³ Since de Moivre gave


13 Euler (1988) chapter 17.
very few examples, we consider two examples from Euler’s exposition, illustrating
the method of using generating functions to solve linear difference equations such as 
(13.3). In the first example, the recurrence relation was the same as the one satisfied 
by the Fibonacci sequence, though the initial values were different. The coefficients 
of the series 


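$$\sum_{n=0}^{\infty} a_n x^n = 1 + 3x + 4x^2 + 7x^3 + 11x^4 + 18x^5 + 29x^6 + 47x^7 + \cdots$$


satisfied the recurrence relation $a_n = a_{n-1} + a_{n-2}$ for $n \geq 2$. Note from our discussion
of de Moivre’s work that the above series would sum to a rational function whose
denominator would be $1 - x - x^2$. In fact, the sum was


$$\frac{1 + 2x}{1 - x - x^2} = \frac{\frac{1+\sqrt{5}}{2}}{1 - \frac{1+\sqrt{5}}{2}x} + \frac{\frac{1-\sqrt{5}}{2}}{1 - \frac{1-\sqrt{5}}{2}x}.$$


Hence Euler had the solution of the difference equation:


$$a_n = \left(\frac{1+\sqrt{5}}{2}\right)^{n+1} + \left(\frac{1-\sqrt{5}}{2}\right)^{n+1}.$$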
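
As a numerical illustration of this first example, the sketch below compares the coefficients produced by the recurrence with the closed form $\left(\frac{1+\sqrt5}{2}\right)^{n+1} + \left(\frac{1-\sqrt5}{2}\right)^{n+1}$. It assumes the initial values $a_0 = 1$, $a_1 = 3$, which are the ones forced by the numerator $1 + 2x$ over the denominator $1 - x - x^2$; the function names are hypothetical, and floating-point arithmetic is used, so the closed form is rounded before comparison.

```python
from math import sqrt


def series_coeffs(count):
    """Coefficients a_n with a_0 = 1, a_1 = 3 and a_n = a_{n-1} + a_{n-2}."""
    a = [1, 3]
    while len(a) < count:
        a.append(a[-1] + a[-2])
    return a[:count]


def closed_form(n):
    """General term read off from the two partial fractions."""
    phi = (1 + sqrt(5)) / 2
    psi = (1 - sqrt(5)) / 2
    return phi ** (n + 1) + psi ** (n + 1)


coeffs = series_coeffs(20)
rounded = [round(closed_form(n)) for n in range(20)]
print(coeffs[:8])
print(rounded == coeffs)
```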


In the second example, there were repeated roots as well as complex roots. Euler 
explained earlier in his book precisely how to obtain the partial fractions in this 
situation. The difference equation would be 


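$$a_n - a_{n-1} - a_{n-2} + a_{n-4} + a_{n-5} - a_{n-6} = 0, \tag{13.7}$$


and the initial conditions would yield the sum of the series as


$$\begin{aligned}
\frac{1}{1 - x - x^2 + x^4 + x^5 - x^6} &= \frac{1}{(1-x)^3(1+x)(1+x+x^2)} \\
&= \frac{1}{6(1-x)^3} + \frac{1}{4(1-x)^2} + \frac{17}{72(1-x)} + \frac{1}{8(1+x)} + \frac{2+x}{9(1+x+x^2)}.
\end{aligned} \tag{13.8}$$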


Euler obtained the general term a, by expanding the partial fractions using the 
binomial theorem. Thus, he had 


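$$a_n = \frac{n^2}{12} + \frac{n}{2} + \frac{47}{72} \pm \frac{1}{8} \pm \frac{4\sin\frac{(n+1)\pi}{3} - 2\sin\frac{n\pi}{3}}{9\sqrt{3}}, \tag{13.9}$$


where a positive sign was used for $n$ even and a negative sign for $n$ odd.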
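
The general term can be tested against the series itself. The sketch below is an illustration under stated assumptions: it reads the general term as $\frac{n^2}{12} + \frac{n}{2} + \frac{47}{72} \pm \frac18 \pm \frac{4\sin\frac{(n+1)\pi}{3} - 2\sin\frac{n\pi}{3}}{9\sqrt3}$, with both signs positive for even $n$ and negative for odd $n$, and it takes the numerator of the rational function to be 1, so that the recurrence (13.7) holds for every $n \geq 1$ when coefficients with negative index are treated as zero. The function names are hypothetical.

```python
from math import sin, pi, sqrt


def series_coeffs(count):
    """Coefficients of 1 / (1 - x - x^2 + x^4 + x^5 - x^6) via the recurrence (13.7)."""
    a = []
    for n in range(count):
        if n == 0:
            a.append(1)
            continue
        # Coefficients with negative index are taken to be zero.
        get = lambda k: a[k] if k >= 0 else 0
        a.append(get(n - 1) + get(n - 2) - get(n - 4) - get(n - 5) + get(n - 6))
    return a


def general_term(n):
    """Closed form with the sign rule: plus for even n, minus for odd n."""
    sign = 1 if n % 2 == 0 else -1
    trig = (4 * sin((n + 1) * pi / 3) - 2 * sin(n * pi / 3)) / (9 * sqrt(3))
    return n * n / 12 + n / 2 + 47 / 72 + sign / 8 + sign * trig


coeffs = series_coeffs(30)
max_gap = max(abs(general_term(n) - c) for n, c in enumerate(coeffs))
print(coeffs[:10])
print(max_gap < 1e-9)
```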

In his Doctrine of Chances, de Moivre stated a few specific examples but did not
work out details for obtaining the general term. Of his nine theorems, the first six
dealt in general terms with the ideas in the above two examples from Euler. The last
three propositions applied to more general series, though de Moivre worked wholly
in terms of recurrent series. In the seventh proposition, on the even and odd parts of
a rational function, de Moivre supposed that $\sum_{n=0}^{\infty} a_n x^n$ was a recurrent series and
hence representable as a rational function. He then gave a method for representing the
even series $\sum_{n=0}^{\infty} a_{2n}x^{2n}$ and the odd series $\sum_{n=0}^{\infty} a_{2n+1}x^{2n+1}$ as rational functions. In
this connection, he explained that if $A(x)$ was the denominator, or differential scale,
for $\sum_{n=0}^{\infty} a_n x^n$, then the common differential scale for the two series with the even
and odd powers was the polynomial obtained by eliminating $x$ from the equations
$A(x) = 0$ and $x^2 = z$. More generally, de Moivre wrote that if


$$a_0 + a_1 x + a_2 x^2 + \cdots = \frac{B(x)}{A(x)}, \tag{13.10}$$

then the $m$ series

$$a_j x^j + a_{m+j}x^{m+j} + a_{2m+j}x^{2m+j} + \cdots, \quad j = 0, 1, \ldots, m-1, \tag{13.11}$$

had the common differential scale obtained by eliminating $x$ from $A(x) = 0$ and
$x^m = z$. We may state a more general problem: Given $f(x) = \sum_{n=0}^{\infty} a_n x^n$, express
$g(x) = \sum_{n=0}^{\infty} a_{mn+j}x^{mn+j}$ in terms of values of $f(\omega x)$, $\omega$ a root of unity. This was
the problem solved by Simpson and Waring in the late 1750s. The essence of their
method was to use appropriate $m$th roots of unity and those roots were implicit in de
Moivre’s use of the equation $x^m = z$.
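
The role of the roots of unity in this general problem can be illustrated numerically. The sketch below uses the standard multisection identity $\frac{1}{m}\sum_{k=0}^{m-1} \omega^{-kr} f(\omega^k x)$ with $\omega = e^{2\pi i/m}$, which picks out the terms whose index is congruent to $r$ modulo $m$. This is a modern formulation rather than a transcription of Simpson or Waring; the function name `multisect` and the choice of the exponential series as test case are assumptions made for illustration.

```python
import cmath
from math import factorial


def multisect(f, m, r, x):
    """Sum of the terms a_n x^n with n congruent to r (mod m), from values f(w^k x)."""
    total = 0
    for k in range(m):
        w = cmath.exp(2j * cmath.pi * k / m)   # w = omega^k, an m-th root of unity
        total += w ** (-r) * f(w * x)
    return total / m


# Test case (an assumption for illustration): f(x) = exp(x), so a_n = 1/n!.
x = 0.7
value = multisect(cmath.exp, 3, 1, x)
direct = sum(x ** n / factorial(n) for n in range(1, 40, 3))
print(abs(value - direct) < 1e-12)
```

With $r = 0$ and $m = 3$ the same function returns every third term starting from $a_0$, which is the trisection case treated by Simpson.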

The eighth proposition of de Moivre explained how to find the differential scale
for $\sum (a_n + b_n)x^n$ when the differential scales of $\sum a_n x^n$ and $\sum b_n x^n$ were known.
This was straightforward. In the very interesting ninth proposition, de Moivre worked
out the differential scale for $\sum a_n b_n x^n$, but only for the case where the scales for
$\sum a_n x^n$ and $\sum b_n x^n$ were quadratic polynomials. His result is stated as an exercise at
the end of this chapter. An immediate consequence of this result is that $\sum a_n b_n x^n$ has
a singularity at $\alpha\beta$ if $\sum a_n x^n$ and $\sum b_n x^n$ have singularities at $\alpha$ and $\beta$ respectively.
In 1899, Hadamard, probably unaware of this result of de Moivre, stated and proved
a beautiful generalization, usually called Hadamard’s multiplication of singularities
theorem:¹⁴ If $\sum_{n=0}^{\infty} a_n z^n$ has singularities at $\alpha_1, \alpha_2, \ldots$, and $\sum_{n=0}^{\infty} b_n z^n$ at $\beta_1, \beta_2, \ldots$,
then the singularities of $\sum_{n=0}^{\infty} a_n b_n z^n$ are among the points $\alpha_i \beta_j$.


13.3. Simpson and Waring on Partitioning Series 


In 1758, Thomas Simpson gave a solution of the general problem of determining the 
values of the m series in (13.11) when (13.10) was replaced by a general series 


$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots. \tag{13.12}$$


Simpson’s paper,¹⁵ “The invention of a general method for determining the sum
of every 2d, 3d, 4th, or 5th, &c. term of a series, taken in order; the sum of the 
whole series being known,” was published in the Philosophical Transactions of 


14 Hadamard (1899). 
15 Simpson (1759).
the Royal Society. To accomplish the task of his paper, Simpson employed roots of
unity as well as some theory of symmetric functions. In fact, one can avoid the use 
of symmetric functions here, but the origins of this topic are relevant, to the extent that 
Simpson employed it in his arguments. 

In his 1629 book, Invention nouvelle en l’algèbre, Albert Girard (1595-1632)
defined the elementary symmetric functions:¹⁶


When several numbers are proposed, the entire sum may be called the first faction; the sum of 
all the products taken two by two may be called the second faction; and always thus to the end, 
but the product of all numbers is the last faction. Now there are as many factions as proposed 
numbers. 


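Thus, if $\alpha_1, \alpha_2, \ldots, \alpha_n$ are $n$ quantities, the elementary symmetric functions of these
$n$ quantities would be defined by


$$\sigma_1 = \sum_{i=1}^{n} \alpha_i, \quad \sigma_2 = \sum_{1 \le i < j \le n} \alpha_i\alpha_j, \quad \sigma_3 = \sum_{1 \le i < j < k \le n} \alpha_i\alpha_j\alpha_k, \quad \ldots, \quad \sigma_n = \alpha_1\alpha_2\cdots\alpha_n.$$


Girard referred to $\sigma_1$ as the first faction, $\sigma_2$ as the second faction, and $\sigma_n$ as the last
faction. This concept has connections with algebraic equations:


$$(x - \alpha_1)(x - \alpha_2)\cdots(x - \alpha_n) = x^n - \sigma_1 x^{n-1} + \sigma_2 x^{n-2} - \sigma_3 x^{n-3} + \cdots + (-1)^n\sigma_n, \tag{13.13}$$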


a result verifiable by induction. 
Girard explained the connection between the elementary symmetric functions of 
the roots of a polynomial and the coefficients of the polynomial: 


Every algebraic equation except the incomplete ones admits of as many solutions as the 
denomination of the highest quantity indicates. And the first faction of the solutions is equal 
to the number of the first mixed quantity, the second faction of them is equal to the number of 
the second mixed quantity, the third to the third, and so on, so that the last faction is equal to the 
closing quantity—all this according to the signs that can be noted in the alternating order. 


Now the symmetric functions result invoked by Simpson maintained that the sum 
of a given power of roots 


$$s_m = \alpha_1^m + \alpha_2^m + \cdots + \alpha_n^m, \quad m \text{ a positive integer}, \tag{13.14}$$

must be a polynomial in $\sigma_1, \sigma_2, \ldots, \sigma_n$. Girard stated this result as:


It might seem to some that the factions would be also explicable otherwise than above. That 
instead of saying the sum, the products two by two, the products three by three, etc., one could 
say more simply, the sum, the sum of the squares, the sum of the cubes, etc., which however is 
not so, for when there are several solutions, the sum will be for the first mixed quantity, the sum 
of the products two by two for the second, etc., as has been sufficiently explicated. But it is not 
the case for any factions of the powers that someone might offer. 


16 For the Girard material, see Girard (1884), a later printing in which no page numbers seem to be given; also
see translation by Robert Smith, in de Beaune et al. (1986).
Example: Let:

A be the first quantity,
B the second,
C the third,
D the fourth,
etc.

Then, in every type of equation,

A
Asq − B2
Acub − AB3 + C3
Asq-sq − AsqB4 + AC4 + Bsq2 − D4

will be the sum, respectively, of the

solutions
squares
cubes
square-squares.


In modern notation, using (13.14), Girard had 


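$$\begin{aligned}
s_1 &= \sigma_1, \quad s_2 = \sigma_1^2 - 2\sigma_2, \quad s_3 = \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3, \\
s_4 &= \sigma_1^4 - 4\sigma_1^2\sigma_2 + 4\sigma_1\sigma_3 + 2\sigma_2^2 - 4\sigma_4,
\end{aligned} \tag{13.15}$$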


a result rediscovered by Newton in 1665-1666.¹⁷ Newton had learned algebra by
reading Oughtred, Viète, and Descartes; he read Descartes in van Schooten’s Latin
translation. Thus, it appears that Newton’s discovery was independent.¹⁸ Moreover,
Newton also provided a recurrence rule for determining $s_k$ when $s_1, s_2, \ldots, s_{k-1}$ were
known. In modern notation, this rule is


$$s_k - \sigma_1 s_{k-1} + \sigma_2 s_{k-2} - \cdots + (-1)^{k-1}\sigma_{k-1}s_1 + (-1)^k k\sigma_k = 0. \tag{13.16}$$


Newton stated his rule within his earliest researches in algebra.¹⁹ He included these
and other algebraic results from 1665-1666 in his lectures on algebra, given in the
1670s and early 1680s, published in 1707 as Arithmetica Universalis.²⁰ A proof of
the rule was not published by Newton, but was given by Maclaurin in his posthumous
Treatise of Algebra.²¹ Euler gave two different proofs in 1750.²²

In his Meditationes Algebraicae, Edward Waring (c. 1736-1798) accused Simpson 
of stealing his idea for partitioning series into n parts. The English translation by 


17 Newton (1967-81) vol. 1 pp. 517-520.

18 ibid. p. 518, footnote 12. 

19 ibid. p. 519. See also footnote 15. 

20 For a reprint of this book, see Newton (1964-67), vol. 2.
21 Maclaurin (1748).

22 Eu. I-6, pp. 20-30. E 153. See also pp. 263-286. E 406.
Dennis Weeks of the 1782 edition of Waring’s Meditationes presents Waring’s point
of view:²³


De Moivre gives a method of generating the sum of a series of terms that are equal, or alternating, 
or cyclic over an interval of distance two, three, or more, $a + bx + cx^2 + \cdots$, through division of
unity by a rational multinomial expression $p + qx + rx^2 + \cdots$. In 1757 I sent the first version
of this work to the Royal Society of London, which Simpson read, then in 1758 he inserted in the
Philosophical Transactions a short piece containing a rule that was in the work I submitted, viz
let S be a given function of the quantity x, which is expanded into a series proceeding according
to the dimensions of x, say $a + bx + cx^2 + dx^3 + \cdots$; in S now substitute for x respectively
$\alpha x, \beta x, \gamma x, \delta x, \ldots$ where $\alpha, \beta, \gamma, \delta, \ldots$ are roots of the equation $x^n - 1 = 0$, resulting in a total
of n quantities A, B, C, D, etc., then $\frac{A + B + C + D + \text{etc.}}{n}$ will be the sum of the first term and of those
whose position is respectively n, 2n, 3n, etc. beyond the first term. Nevertheless at the end of his
paper he says of the series $a + bx + cx^2 + \cdots$ that he has given a solution of an example of this
problem by a method which differs a little from another one where a general method is indicated. 
But I say that no one, before my submission to the Royal Society in 1757, had ever claimed to 
have devised a general method, and it was my notes that Simpson had read, in which the above 
method was contained. 


It is possible that Waring could have been mistaken. Though not formally trained, 
Simpson was an able mathematician, capable of conceiving of the idea of partitioning 
series. Also, it is not at all clear that Waring’s communication to the Royal Society was 
read by Simpson, or that it was indeed available to him. However, Waring provided an 
important result, allowing one to calculate $s_m$ directly, stated and proved as the first
result in his Meditationes:


$$s_m = m\sum (-1)^{m + k_1 + k_2 + \cdots + k_m}\, \frac{(k_1 + k_2 + \cdots + k_m - 1)!}{k_1!\,k_2!\cdots k_m!}\, \sigma_1^{k_1}\sigma_2^{k_2}\cdots\sigma_m^{k_m}, \tag{13.17}$$


where the sum was to be taken over all $k_1, \ldots, k_m$ such that $k_1 + 2k_2 + \cdots + mk_m = m$.
The number of terms in the sum yielded the number of partitions of $m$, denoted by
$p(m)$. A definition of partitions is provided in our Chapter 26. To get an idea of the
usefulness of Waring’s formula (13.17), we here apply it to efficiently derive $s_6$; note
that it took Newton much more effort to do this in 1665-1666.

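We first find all eleven solutions of $k_1 + 2k_2 + 3k_3 + 4k_4 + 5k_5 + 6k_6 = 6$, listing only the nonzero $k_i$:


$$\begin{aligned}
&k_1 = 6; \quad k_1 = 4,\ k_2 = 1; \quad k_1 = 3,\ k_3 = 1; \\
&k_1 = 2,\ k_2 = 2; \quad k_1 = 2,\ k_4 = 1; \quad k_1 = 1,\ k_5 = 1; \\
&k_1 = 1,\ k_2 = 1,\ k_3 = 1; \quad k_2 = 3; \\
&k_2 = 1,\ k_4 = 1; \quad k_3 = 2; \quad k_6 = 1.
\end{aligned}$$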


23 Waring and Weeks (1991) pp. xlii—xliii. 
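Note that these solutions indicate that $p(6) = 11$. Now observe that the solution
$k_1 = 3$, $k_3 = 1$, for example, leads to the term


$$6(-1)^{6+3+1}\,\frac{3!}{3!\,1!}\,\sigma_1^3\sigma_3 = 6\sigma_1^3\sigma_3$$


and the complete formula for $s_6$ is given by


$$\begin{aligned}
s_6 = {} & \sigma_1^6 - 6\sigma_1^4\sigma_2 + 6\sigma_1^3\sigma_3 + 9\sigma_1^2\sigma_2^2 - 6\sigma_1^2\sigma_4 \\
& - 12\sigma_1\sigma_2\sigma_3 + 6\sigma_1\sigma_5 - 2\sigma_2^3 + 6\sigma_2\sigma_4 + 3\sigma_3^2 - 6\sigma_6.
\end{aligned}$$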
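
The eleven terms of $s_6$ can be generated by machine. The sketch below is a minimal illustration, assuming Waring's formula in the form $s_m = m\sum (-1)^{m + k_1 + \cdots + k_m} \frac{(k_1 + \cdots + k_m - 1)!}{k_1!\cdots k_m!}\sigma_1^{k_1}\cdots\sigma_m^{k_m}$, summed over nonnegative integers with $k_1 + 2k_2 + \cdots + mk_m = m$. All function names are hypothetical.

```python
from math import factorial


def multiplicity_tuples(m):
    """Yield (k_1, ..., k_m), nonnegative integers with k_1 + 2k_2 + ... + m k_m = m."""
    def rec(part, remaining):
        if part == 0:
            if remaining == 0:
                yield ()
            return
        for k in range(remaining // part + 1):
            for rest in rec(part - 1, remaining - k * part):
                yield rest + (k,)
    yield from rec(m, m)


def waring_terms(m):
    """Map each tuple (k_1, ..., k_m) to the integer coefficient of its monomial."""
    terms = {}
    for ks in multiplicity_tuples(m):
        total = sum(ks)
        denom = 1
        for k in ks:
            denom *= factorial(k)
        magnitude = m * factorial(total - 1) // denom   # always an exact division
        terms[ks] = (-1) ** (m + total) * magnitude
    return terms


def power_sum_waring(sigma, m):
    """Evaluate s_m from [sigma_1, ..., sigma_m] using the terms above."""
    value = 0
    for ks, coeff in waring_terms(m).items():
        monomial = 1
        for k, sig in zip(ks, sigma):
            monomial *= sig ** k
        value += coeff * monomial
    return value


terms6 = waring_terms(6)
print(len(terms6))                      # number of partitions of 6
# Quantities 1, 2, 3: sigma = (6, 11, 6), higher sigmas are zero.
print(power_sum_waring([6, 11, 6, 0, 0, 0], 6), 1 ** 6 + 2 ** 6 + 3 ** 6)
```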


Returning to Simpson’s method, he gave an illustration of his approach by
partitioning the series (13.12) into three parts; for numbers $p$, $q$, and $r$, he obtained


$$\begin{aligned}
\tfrac{1}{3} f(px) &= \tfrac{1}{3}\left(a_0 + a_1 p x + a_2 p^2 x^2 + a_3 p^3 x^3 + \cdots\right), \\
\tfrac{1}{3} f(qx) &= \tfrac{1}{3}\left(a_0 + a_1 q x + a_2 q^2 x^2 + a_3 q^3 x^3 + \cdots\right), \\
\tfrac{1}{3} f(rx) &= \tfrac{1}{3}\left(a_0 + a_1 r x + a_2 r^2 x^2 + a_3 r^3 x^3 + \cdots\right).
\end{aligned}$$


Simpson wrote that


$$\tfrac{1}{3}\left(f(px) + f(qx) + f(rx)\right) = a_0 + a_3 x^3 + a_6 x^6 + \cdots$$


if

$$p + q + r = 0, \quad p^2 + q^2 + r^2 = 0, \quad p^3 + q^3 + r^3 = 3, \quad p^4 + q^4 + r^4 = 0, \quad \text{etc.}$$

Thus he required that

$$p^k + q^k + r^k = 0 \quad \text{when } 3 \nmid k$$

and


$$p^k + q^k + r^k = 3 \quad \text{when } 3 \mid k. \tag{13.18}$$


He noted that the relations in (13.18) would hold if $p, q, r$ were the roots of the
equation


$$x^3 - 1 = 0.$$


He observed that the methods of algebra showed that $p + q + r = 0$ and $p^2 +
q^2 + r^2 = 0$, presumably since the elementary symmetric functions $p + q + r$ and
$pq + pr + rq$ of the roots $p, q, r$ were zero. He also noted that $p^3 + q^3 + r^3 = 3$,
$p^4 + q^4 + r^4 = p + q + r$, $p^5 + q^5 + r^5 = p^2 + q^2 + r^2$, and so on. In this way, he
verified his result for the case where the series was trisected.
On the general case, Simpson wrote:²⁴


And, by the very same reasoning, and the process above laid down, it is evident, that, if every 
nth term (instead of every third term) of the given series be taken, the values p,q,r,s, &c. will 
be the roots of the equation $z^n - 1 = 0$; and that, the sum of all the terms so taken, will be truly
obtained by substituting px,qx,rx,sx, &c. successively for x, in the given value of S [the sum of 
the series], and then dividing the sum of all the quantities thence arising from the given number n. 


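In order to verify Simpson’s contention for the general case, one must show that,
given $p_1, p_2, \ldots, p_m$ as the $m$ roots of


$$x^m - 1 = 0, \tag{13.19}$$


the power sums $s_k = p_1^k + p_2^k + \cdots + p_m^k$ satisfy $s_k = 0$ when $k < m$ and $s_k = m$ when $k = m$.


Note that $\sigma_1, \sigma_2, \ldots, \sigma_{m-1}$ are all zero, while $\sigma_m = (-1)^{m+1}$. It follows
from Newton’s relation (13.16) that for $k < m$


$$s_k = \sigma_1 s_{k-1} - \sigma_2 s_{k-2} + \cdots + (-1)^k\sigma_{k-1}s_1 + (-1)^{k+1}k\sigma_k = 0.$$

Recalling that $s_m = m$, it follows from (13.19) that, for $n = qm + k$ with $0 < k \leq m$,

$$s_n = p_1^n + p_2^n + \cdots + p_m^n = p_1^k + p_2^k + \cdots + p_m^k = s_k,$$


completing the proof.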
In the middle of his paper, while he was considering some examples, Simpson
suddenly mentioned that the known values of $p_1, p_2, p_3, \ldots$ were of the form
$\alpha + \sqrt{\alpha^2 - 1}$, where $\alpha = \cos\frac{2\pi k}{m}$. Thus, if


$$p_1 = \cos\frac{2\pi}{m} + i\sin\frac{2\pi}{m}$$


then we can take


$$p_k = \cos\frac{2\pi k}{m} + i\sin\frac{2\pi k}{m},$$


because, by de Moivre’s theorem,


$$\left(\cos\frac{2\pi}{m} + i\sin\frac{2\pi}{m}\right)^k = \cos\frac{2\pi k}{m} + i\sin\frac{2\pi k}{m}.$$


This indicates that all the $m$th roots of unity are given by $1, \omega, \omega^2, \ldots, \omega^{m-1}$ where


$$\omega = \cos\frac{2\pi}{m} + i\sin\frac{2\pi}{m}.$$


24 Simpson (1759) pp. 759-760. 
Also, it is clear that


$$1 + \omega^k + \omega^{2k} + \cdots + \omega^{(m-1)k} = \frac{1 - \omega^{mk}}{1 - \omega^k} = 0 \quad \text{when } m \nmid k. \tag{13.20}$$


This confirms the fact that the theory of symmetric functions can be avoided, as we 
saw in the case m = 3 in Section 2.11, where (13.20) is used in the particular case 
given by (2.65). 


13.4 Stirling’s Method of Ultimate Relations 


Stirling extended de Moivre’s recurrent series method to sequences satisfying difference
equations with nonconstant coefficients. In the preface to his 1730 Methodus, he
wrote,²⁵


For I was not unaware that De Moivre had introduced this property of the terms into algebra with 
the greatest success, as the basis for solving very difficult problems concerning recurrent series: 
And so I decided to find out whether it could also be extended to others, which of course I doubted 
since there is so great a difference between recurrent and other series. But, the practical test having 
been made, the matter has succeeded beyond hope, for I have found out that this discovery of De 
Moivre contains very general and also very simple principles not only for recurrent series but also 
for any others in which the relation of the terms varies according to some regular law. 


In the statement of proposition 14, Stirling explained the term ultimate relation:²⁶


Let $T$ be the $z$th term of a series and $T'$ the next term; let $r, s, a, b, c, d$ be constants.
Suppose that the relation


$$r(z^2 + az + b)T + s(z^2 + cz + d)T' = 0 \tag{13.21}$$


held between the successive terms. Then the ultimate relation of the terms was
defined as


$$rT + sT' = 0. \tag{13.22}$$


Stirling used the term ultimate because he understood that $z$ was a very large
integer, so that $az + b$ and $cz + d$ could be neglected in comparison to $z^2$. This made
it clear that (13.22) followed from (13.21). Similarly, if the equation were


$$r(z + a)T + s(z + b)T' + t(z + c)T'' = 0, \tag{13.23}$$

then the ultimate relation would be


$$rT + sT' + tT'' = 0. \tag{13.24}$$


25 Stirling and Tweddle (2003) p. 18. 
26 ibid. p. 88. 
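In modern notation, if $\sum A_k$ is the series, then (13.23) takes the form

$$r(k + a)A_k + s(k + b)A_{k+1} + t(k + c)A_{k+2} = 0. \tag{13.25}$$

Stirling stated his theorem as proposition 14:


Every series $A + B + C + D + E + \text{\&c.}$ in which the ultimate relation of the terms is $rT + sT' +
tT'' = 0$ splits into the following

$$(s + t)\left(\frac{A}{n} + \frac{A_2}{n^2} + \frac{A_3}{n^3} + \frac{A_4}{n^4} + \text{\&c.}\right) + t\left(\frac{B}{n} + \frac{B_2}{n^2} + \frac{B_3}{n^3} + \frac{B_4}{n^4} + \text{\&c.}\right)$$

where $n = r + s + t$ and


$$\begin{aligned}
A_2 &= rA + sB + tC, & A_3 &= rA_2 + sB_2 + tC_2, & A_4 &= rA_3 + sB_3 + tC_3, \ \text{\&c.} \\
B_2 &= rB + sC + tD, & B_3 &= rB_2 + sC_2 + tD_2, & B_4 &= rB_3 + sC_3 + tD_3, \ \text{\&c.} \\
C_2 &= rC + sD + tE, & C_3 &= rC_2 + sD_2 + tE_2, & C_4 &= rC_3 + sD_3 + tE_3, \ \text{\&c.} \\
D_2 &= rD + sE + tF, & D_3 &= rD_2 + sE_2 + tF_2, & D_4 &= rD_3 + sE_3 + tF_3, \ \text{\&c.} \\
E_2 &= rE + sF + tG, & E_3 &= rE_2 + sF_2 + tG_2, & E_4 &= rE_3 + sF_3 + tG_3, \ \text{\&c.}
\end{aligned}$$
&c.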


This result generated some interest in its time. A reviewer of the Methodus wrote 
in 1732 that the result was very powerful and complicated. In a letter to Stirling dated 
June 8, 1736, Euler wrote²⁷


But before I wrote to you, I searched all over with great eagerness for your excellent book on the 
method of differences, a review of which I had seen a short time before in the Actae Lipsienses,
until I achieved my desire. Now that I have read through it diligently, I am truly astonished at the 
great abundance of excellent methods contained in such a small volume, by means of which you 
show how to sum slowly converging series with ease and how to interpolate progressions which 
are very difficult to deal with. But especially pleasing to me was prop. XIV of Part I in which 
you give a method by which series, whose law of progression is not even established, may be 
summed with great ease using only the relation of the last terms; certainly this method extends 
very widely and is of the greatest use. In fact the proof of this proposition, which you seem to 
have deliberately withheld, caused me enormous difficulty, until at last I succeeded with very 
great pleasure in deriving it from the preceding results, which is the reason why I have not yet 
been able to examine in detail all the subsequent propositions. 


Stirling gave three examples of this theorem. The first example, similar to the 
second, was the summation of the series 


$$1 + 4x + 9x^2 + 16x^3 + 25x^4 + 36x^5 + \text{etc.} \tag{13.26}$$


Recall that Euler summed this series in his Institutiones Calculi Differentialis of 
1755 by applying Newton or Montmort’s transformation. Stirling was aware that the 


27 Tweddle (1988) p. 141. 
series could be summed by that method and mentioned Montmort explicitly. Stirling
observed that the difference equation for the terms of the series was


$$(z^2 + 2z + 1)xT - z^2 T' = 0. \tag{13.27}$$


For example, for the third term, $z = 3$, $T = 9x^2$, and $T' = 16x^3$. The ultimate relation
was $xT - T' = 0$, so that $r = x$, $s = -1$, $t = 0$, and $n = x - 1$. It followed that
$A = 1$, $A_2 = -3x$, $A_3 = 2x^2$, $A_4 = 0$, and the series transformed to


$$-\left(\frac{1}{x - 1} - \frac{3x}{(x - 1)^2} + \frac{2x^2}{(x - 1)^3}\right) = \frac{1 + x}{(1 - x)^3}.$$


In the third example, he considered the series 


$$1 - 6x + 27x^2 - 104x^3 + 366x^4 - 1212x^5 + 3842x^6 - 11784x^7 + \text{etc.}$$


defined by the difference equation


$$x^2(z + 4)T - 2x(z + 2)T' - zT'' = 0; \tag{13.28}$$

the ultimate relation was

$$x^2 T - 2xT' - T'' = 0. \tag{13.29}$$


Hence $r = x^2$, $s = -2x$, $t = -1$ and $n = x^2 - 2x - 1$. Stirling computed the
values of the $A$ and $B$ as

$$\begin{aligned}
A &= 1, & A_2 &= -14x^2, & A_3 &= 29x^4, & A_4 &= 0, \\
B &= -6x, & B_2 &= 44x^3, & B_3 &= -70x^5, & B_4 &= 0.
\end{aligned}$$


Thus, the sum of the series was 


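$$\begin{aligned}
&-(2x + 1)\left(\frac{1}{x^2 - 2x - 1} - \frac{14x^2}{(x^2 - 2x - 1)^2} + \frac{29x^4}{(x^2 - 2x - 1)^3}\right) \\
&\quad - \left(\frac{-6x}{x^2 - 2x - 1} + \frac{44x^3}{(x^2 - 2x - 1)^2} - \frac{70x^5}{(x^2 - 2x - 1)^3}\right).
\end{aligned}$$


Note that in (13.28) $z$ takes the values $2, 3, 4, \ldots$ while in (13.27) $z$ starts at 1. Thus, in
the second series when $T = 1$, $T' = -6x$, and $T'' = 27x^2$, we take $z = 2$. Stirling’s
normal practice was to start at $z = 1$.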
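
Stirling's transformation lends itself to a direct computational check. The sketch below is an illustration under an explicit assumption: it takes the splitting to have the form $(s+t)\left(\frac{A}{n} + \frac{A_2}{n^2} + \cdots\right) + t\left(\frac{B}{n} + \frac{B_2}{n^2} + \cdots\right)$ with $n = r + s + t$, which is the form consistent with both worked examples. The function name is hypothetical, exact rational arithmetic is used, and the closed form $(1 + 2x - x^2)^{-3}$ used to check the second series is not quoted from Stirling; it is the function whose expansion begins $1 - 6x + 27x^2 - \cdots$ and whose coefficients satisfy (13.28).

```python
from fractions import Fraction


def stirling_transform(terms, r, s, t, levels):
    """Sum a series from its leading terms via the ultimate relation rT + sT' + tT'' = 0.

    Assumed splitting: (s+t) * sum(A_j / n^j) + t * sum(B_j / n^j), with n = r + s + t.
    Each level consumes two terms, so len(terms) should be at least 2 * levels.
    """
    n = r + s + t
    row = list(terms)            # row[0], row[1] play the roles of A_j, B_j
    total = Fraction(0)
    for j in range(1, levels + 1):
        total += ((s + t) * row[0] + t * row[1]) / n ** j
        row = [r * row[i] + s * row[i + 1] + t * row[i + 2] for i in range(len(row) - 2)]
    return total


# First example: 1 + 4x + 9x^2 + ..., ultimate relation xT - T' = 0.
x = Fraction(1, 3)
squares = [(k + 1) ** 2 * x ** k for k in range(8)]
sum_squares = stirling_transform(squares, x, Fraction(-1), Fraction(0), 3)
print(sum_squares == (1 + x) / (1 - x) ** 3)

# Third example: ultimate relation x^2 T - 2x T' - T'' = 0.
y = Fraction(1, 10)
c = [1, -6, 27, -104, 366, -1212, 3842, -11784]
third = [ck * y ** k for k, ck in enumerate(c)]
sum_third = stirling_transform(third, y ** 2, -2 * y, Fraction(-1), 3)
print(sum_third == 1 / (1 + 2 * y - y ** 2) ** 3)
```

Three levels suffice in both cases because $A_4$ and $B_4$ vanish, so the transformed series terminates.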


13.5 Daniel Bernoulli on Difference Equations 


In 1728, while at the St. Petersburg Academy, Bernoulli presented to the academy 
a method for solving a difference equation in which the form of the solution was 
assumed; this particular approach is often given in elementary textbooks.²⁸ Unlike
Bernoulli, we use subscripts to write the equation


$$a_n = \alpha_1 a_{n-1} + \alpha_2 a_{n-2} + \cdots + \alpha_k a_{n-k}, \tag{13.30}$$


with $\alpha_1, \alpha_2, \ldots, \alpha_k$ constants. Bernoulli assumed $a_n = \lambda^n$, substituted in the equation
and divided by $\lambda^{n-k}$ to arrive at


$$\lambda^k = \alpha_1\lambda^{k-1} + \alpha_2\lambda^{k-2} + \cdots + \alpha_k. \tag{13.31}$$


Bernoulli stated that if $\lambda_1, \lambda_2, \ldots, \lambda_k$ were the $k$ distinct solutions of the algebraic
equation (13.31), then the general solution of (13.30) would be an arbitrary linear
combination of the particular solutions $\lambda_j^x$, that is


$$a_x = A_1\lambda_1^x + A_2\lambda_2^x + \cdots + A_k\lambda_k^x. \tag{13.32}$$


However, if $\lambda_1 = \lambda_2$, then the first two terms of (13.32) would be replaced by
$(A_1 + A_2 x)\lambda_1^x$. More generally, if a root $\lambda_j$ was repeated $m$ times, then that part of the
solution (13.32) corresponding to $\lambda_j$ would be replaced by


$$(A_j + A_{j+1}x + \cdots + A_{j+m-1}x^{m-1})\lambda_j^x.$$


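Daniel Bernoulli considered examples of distinct roots and of repeated roots. He first
took the Fibonacci sequence 0, 1, 1, 2, 3, 5, 8, 13, ..., leading to the difference equation
$a_n = a_{n-1} + a_{n-2}$ and the algebraic equation $\lambda^2 = \lambda + 1$. The solutions were $\lambda_1 =
\frac{1+\sqrt{5}}{2}$, $\lambda_2 = \frac{1-\sqrt{5}}{2}$, so that


$$a_x = A_1\lambda_1^x + A_2\lambda_2^x.$$


To find $A_1$ and $A_2$, Bernoulli took $x = 0$ and $x = 1$ to get


$$A_1 + A_2 = 0 \quad \text{and} \quad A_1\left(\frac{1+\sqrt{5}}{2}\right) + A_2\left(\frac{1-\sqrt{5}}{2}\right) = 1.$$


Solving these equations, Bernoulli found $A_1 = \frac{1}{\sqrt{5}}$ and $A_2 = -\frac{1}{\sqrt{5}}$. Recall that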


Montmort and Niklaus I Bernoulli in their correspondence of 1718-1719 had already 
solved the problem of the general term in the Fibonacci sequence. 

As an example of a difference equation leading to repeated roots, Daniel Bernoulli
considered the sequence 0, 0, 0, 0, 1, 0, 15, −10, 165, −228, etc., generated by the difference
equation


$$a_n = 0a_{n-1} + 15a_{n-2} - 10a_{n-3} - 60a_{n-4} + 72a_{n-5}.$$


28 Bernoulli (1982-1996) vol. 2, pp. 49-64. 
He found the roots of the corresponding algebraic equation to be 2, 2, 2, −3, −3; the
general term of the sequence was then


$$\frac{(1026 - 1035x + 225xx)\cdot 2^x + (224 - 80x)\cdot(-3)^x}{90000},$$

where $x = 1$ gives the first term of the sequence.

As a final example, Bernoulli set $a_n = \sin nx$ and applied the addition formula for
sine to get, in modern notation,


$$a_{n+1} + a_{n-1} = \sin(n + 1)x + \sin(n - 1)x = 2\cos x \sin nx = 2\cos x \cdot a_n.$$

This produced the algebraic equation $\lambda^2 - 2\cos x\,\lambda + 1 = 0$ whose roots were given by


$$\begin{aligned}
\lambda_1 &= \cos x + \sqrt{\cos^2 x - 1} = \cos x + \sqrt{-1}\sin x, \\
\lambda_2 &= \cos x - \sqrt{\cos^2 x - 1} = \cos x - \sqrt{-1}\sin x.
\end{aligned}$$


This gave him the formula for sine:


$$a_n = \sin nx = \frac{(\cos x + \sqrt{-1}\sin x)^n - (\cos x - \sqrt{-1}\sin x)^n}{2\sqrt{-1}}.$$


Note that this gives a new proof of de Moivre’s formula. Bernoulli also made
an interesting observation about the root largest in absolute value of the algebraic
equation (13.31), noting that such a root could be obtained from the sequence
satisfying the corresponding difference equation. Taking $\lambda$ to be the root largest
in absolute value, and writing the sequence as $a_1, a_2, a_3, \ldots, a_m, \ldots$, he observed
that as $m$ went to infinity, $\frac{a_{m+1}}{a_m}$ would approach the value $\lambda$. Also, the root smallest in
absolute value could be found by setting $\lambda = \frac{1}{\mu}$ in (13.31). Bernoulli was quite proud
of this result; he wrote in a letter to Goldbach on February 20, 1728,²⁹ that even if it were
not useful, it was among the most beautiful theorems on the topic. Euler must have
agreed with Bernoulli, since he devoted a whole chapter of his Introductio of 1748 to
finding roots of algebraic equations by solving difference equations.³⁰

Illustrating that his beautiful theorem was in fact useful, Bernoulli showed how to
find the approximate solution of $xx = 26$. He began by setting $x = y + 5$ to get
$1 = 10y + yy$. Of the two roots of this last equation, he needed the smaller in absolute
value, so he set $y = \frac{1}{z}$ to obtain $z^2 = 10z + 1$. The corresponding difference equation
was $a_n = 10a_{n-1} + a_{n-2}$, and Bernoulli took the two initial values of the sequence to
be 0 and 1. The difference equation then gave him the sequence 0, 1, 10, 101, 1020,
10301, 104030, .... To obtain an approximate value of $y$, Bernoulli took the ratio of
the seventh and sixth terms of the sequence, obtaining $x = \sqrt{26} \approx 5 + \frac{10301}{104030} =
5.09901951360$. He then computed $\sqrt{26}$ by the usual method and got 5.09901951359.
Bernoulli employed this idea to find the smallest roots of Laguerre polynomials of low 


29 Fuss (1968) vol. 2, pp. 250-253. 
30 Euler (1988) chapter 17. 
degree. In his work with hanging chains, the roots of these polynomials yielded the
frequencies of the oscillations. See Exercises 3 and 4 of Chapter 14. 
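
Bernoulli's approximation of $\sqrt{26}$ is easy to reproduce. The following sketch, with hypothetical names, builds the sequence attached to $z^2 = 10z + 1$ from the initial values 0 and 1 and forms the quotient of the sixth term by the seventh, which approximates the smaller root $y$ of $1 = 10y + yy$.

```python
from fractions import Fraction
from math import sqrt


def recurrence_terms(first, second, count):
    """Terms of a_n = 10*a_{n-1} + a_{n-2}, the equation attached to z^2 = 10z + 1."""
    a = [first, second]
    while len(a) < count:
        a.append(10 * a[-1] + a[-2])
    return a[:count]


terms = recurrence_terms(0, 1, 7)
y_approx = Fraction(terms[5], terms[6])    # sixth term over seventh term
root_approx = 5 + y_approx                 # since x = y + 5
error = abs(float(root_approx) - sqrt(26))
print(terms)
print(float(root_approx), error)
```

Taking later terms of the sequence improves the approximation, because successive ratios converge to the root of largest absolute value of $z^2 = 10z + 1$.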


13.6 Lagrange: Nonhomogeneous Equations 


In 1759, Lagrange published a method for solving a nonhomogeneous linear difference
equation with constant coefficients and this method can be seen as the analog of
d’Alembert’s method for the corresponding differential equation.³¹ Lagrange started
with a third-order equation to illustrate the technique. In brief, let the equation be
$y + A\Delta y + B\Delta^2 y + C\Delta^3 y = X$; set $\Delta y = p$ and $\Delta p = q$ so that the equation
can be written as $y + Ap + Bq + C\Delta q = X$. For the arbitrary constants $a$ and $b$
we have


$$y + (A + a)p + (B + b)q - a\Delta y - b\Delta p + C\Delta q = X. \tag{13.33}$$


Choose $a$ and $b$ such that


$$\Delta y + (A + a)\Delta p + (B + b)\Delta q = \Delta y + \frac{b}{a}\Delta p - \frac{C}{a}\Delta q. \tag{13.34}$$


Then $A + a = \frac{b}{a}$, $B + b = -\frac{C}{a}$, implying that $a$ satisfies the cubic $a^3 + Aa^2 + Ba +
C = 0$. Moreover, by (13.34), equation (13.33) is reduced to the first-order equation


$$z - a\Delta z = X, \tag{13.35}$$

where


$$z = y + (A + a)p + (B + b)q. \tag{13.36}$$


The problem is now reduced to solving (13.35). Suppose it has been solved 
for each of the three values aj, a2, a3 of a, obtained from the cubic. Let z1, z2 
and z3 be the corresponding values of z from (13.35). Now we have three linear 
equations 


yt (At+aipt+ (B+ big =21, yt (A+ar)p + (B+ b2)q = 22, 


y+ (At+a3)p+ (B+ b3)q = 23, 


and these can be solved to obtain y = Fz, + Gz2 + Hz3 for some constants F,G, 
and #H. Finally, to solve the first-order equation (13.35), Lagrange considered the more 
general equation 


Ay+ My=N, (13.37) 


31 Lagrange (1867-1892) vol. 1, pp. 23-36. 


where $M$ and $N$ were functions of an integer variable $x$. He set $y = uz$ to get


$$u\Delta z + z\Delta u + Mzu = N. \tag{13.38}$$


He let $u$ be such that $(\Delta u + Mu)z = 0$, so that $u$ was a solution of the homogeneous
part of (13.37). Thus,


$$u(x) - u(x-1) = -M(x-1)u(x-1) \quad\text{or}\quad u(x) = (1 - M(x-1))u(x-1).$$

By iteration,


$$u(x) = (1 - M(x-1))(1 - M(x-2))\cdots(1 - M(1)).$$


For this $u$, equation (13.38) simplified to


$$z(x) - z(x-1) = \frac{N(x-1)}{u(x)}.$$

Therefore

$$z(x) = \frac{N(x-1)}{u(x)} + \frac{N(x-2)}{u(x-1)} + \frac{N(x-3)}{u(x-2)} + \cdots + \frac{N(1)}{u(2)}.$$


Laplace later observed that (13.37) could be solved directly by iteration:

$$\begin{aligned}
y(x) &= y(x-1) + N(x-1) - M(x-1)y(x-1) \\
&= (1 - M(x-1))y(x-1) + N(x-1) \\
&= N(x-1) + (1 - M(x-1))N(x-2) + (1 - M(x-1))(1 - M(x-2))y(x-2), \text{ etc.}
\end{aligned}$$


As Lagrange himself pointed out, this method could obviously be generalized to 
a nonhomogeneous equation of any order. Of course, the question of solving the 
corresponding algebraic equation of arbitrary degree would be a separate problem. 
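
The two routes to the first-order equation (13.37), Lagrange's substitution $y = uz$ and Laplace's direct iteration, can be compared on a small example. The following Python sketch is only an illustration under stated assumptions: the function names, the starting index $x = 1$, the normalization $u(1) = 1$, and the sample choices of $M$ and $N$ are all hypothetical.

```python
from fractions import Fraction

def solve_by_iteration(M, N, y1, x_max):
    """Direct iteration: y(x) = (1 - M(x-1)) * y(x-1) + N(x-1)."""
    y = {1: Fraction(y1)}
    for x in range(2, x_max + 1):
        y[x] = (1 - M(x - 1)) * y[x - 1] + N(x - 1)
    return y

def solve_by_product(M, N, y1, x_max):
    """Substitution y = u*z, assuming u(1) = 1 and z(1) = y(1)."""
    u = {1: Fraction(1)}
    z = {1: Fraction(y1)}
    for x in range(2, x_max + 1):
        u[x] = (1 - M(x - 1)) * u[x - 1]     # homogeneous part
        z[x] = z[x - 1] + N(x - 1) / u[x]    # z(x) - z(x-1) = N(x-1)/u(x)
    return {x: u[x] * z[x] for x in u}

# Sample coefficients (assumed for illustration); 1 - M(x) never vanishes here.
M = lambda x: Fraction(1, x + 1)
N = lambda x: x
print(solve_by_iteration(M, N, 2, 6) == solve_by_product(M, N, 2, 6))
```

Exact rational arithmetic is used so that the two solutions can be compared for equality rather than to within rounding error.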

Lagrange found another method of solving difference equations, using the device
of the variation of parameters.³² Again presenting Lagrange’s work in brief, suppose
we have a third-order difference equation


$$y_{x+3} + P_x y_{x+2} + Q_x y_{x+1} + R_x y_x = V_x. \tag{13.39}$$


Let $z_x$, $z'_x$, $z''_x$ be three independent solutions of the corresponding homogeneous
equation


$$y_{x+3} + P_x y_{x+2} + Q_x y_{x+1} + R_x y_x = 0. \tag{13.40}$$


The general solution of this equation is $Cz_x + C'z'_x + C''z''_x$, where $C$, $C'$, and $C''$
are constants. Now suppose $C_x$, $C'_x$, and $C''_x$ are functions of $x$, determined by the
condition that


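32 ibid. vol. 4, pp. 151-160.

$$y_x = C_x z_x + C'_x z'_x + C''_x z''_x \tag{13.41}$$

is a solution of the nonhomogeneous equation. Changing $x$ to $x + 1$, we have


$$\begin{aligned}
y_{x+1} &= C_{x+1} z_{x+1} + C'_{x+1} z'_{x+1} + C''_{x+1} z''_{x+1} \\
&= C_x z_{x+1} + C'_x z'_{x+1} + C''_x z''_{x+1} + \Delta C_x\, z_{x+1} + \Delta C'_x\, z'_{x+1} + \Delta C''_x\, z''_{x+1}.
\end{aligned}$$


Now suppose that $C_x$, $C'_x$, $C''_x$ are such that

$$z_{x+1}\Delta C_x + z'_{x+1}\Delta C'_x + z''_{x+1}\Delta C''_x = 0.$$

Then

$$y_{x+1} = C_x z_{x+1} + C'_x z'_{x+1} + C''_x z''_{x+1}. \tag{13.42}$$

If in the equation for $y_{x+1}$ we again change $x$ to $x + 1$, the result is

$$y_{x+2} = C_x z_{x+2} + C'_x z'_{x+2} + C''_x z''_{x+2} + \Delta C_x\, z_{x+2} + \Delta C'_x\, z'_{x+2} + \Delta C''_x\, z''_{x+2}.$$

We can again require that

$$z_{x+2}\Delta C_x + z'_{x+2}\Delta C'_x + z''_{x+2}\Delta C''_x = 0.$$

Thus,


$$y_{x+2} = C_x z_{x+2} + C'_x z'_{x+2} + C''_x z''_{x+2}. \tag{13.43}$$


This equation for $y_{x+2}$ resembles the equation for $y_{x+1}$. If we make a
similar $x \to x + 1$ change in the equation for $y_{x+2}$, we obtain


$$y_{x+3} = C_x z_{x+3} + C'_x z'_{x+3} + C''_x z''_{x+3} + z_{x+3}\Delta C_x + z'_{x+3}\Delta C'_x + z''_{x+3}\Delta C''_x. \tag{13.44}$$

Multiply equation (13.41) by $R_x$; multiply equation (13.42) by $Q_x$; multiply
(13.43) by $P_x$. Now add the results to (13.44). From (13.39) and the fact that $z_x$, $z'_x$, $z''_x$
satisfy (13.40), it follows that

$$z_{x+3}\Delta C_x + z'_{x+3}\Delta C'_x + z''_{x+3}\Delta C''_x = V_x. \tag{13.45}$$


Consider (13.45) together with the two equations that we required be satisfied
by the $\Delta C$:


$$z_{x+2}\Delta C_x + z'_{x+2}\Delta C'_x + z''_{x+2}\Delta C''_x = 0,$$

$$z_{x+1}\Delta C_x + z'_{x+1}\Delta C'_x + z''_{x+1}\Delta C''_x = 0.$$


Thus, we have three linear equations yielding $\Delta C_x$, $\Delta C'_x$, and $\Delta C''_x$. Suppose we
obtain $\Delta C_x = H_x$, $\Delta C'_x = H'_x$, and $\Delta C''_x = H''_x$. These first-order equations can be
solved for $C_x$, $C'_x$, and $C''_x$, and hence we have $y_x$ from (13.41).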

As an example of the method of variation of parameters, Lagrange considered a
nonhomogeneous equation with constant coefficients. In this case, briefly, $z_x$ will be
of the form $m^x$ for some constant $m$. Suppose we have a second-order equation for
which $z_x = m^x$ and $z'_x = m_1^x$. Then the equations for $\Delta C_x$ and $\Delta C'_x$ are


$$m^{x+1}\Delta C_x + m_1^{x+1}\Delta C'_x = 0,$$

$$m^{x+2}\Delta C_x + m_1^{x+2}\Delta C'_x = V_x.$$


Solving for $\Delta C_x$ and $\Delta C'_x$, we have


$$\Delta C_x = \frac{V_x}{m^{x+1}(m - m_1)}, \qquad \Delta C'_x = \frac{V_x}{m_1^{x+1}(m_1 - m)};$$

therefore

$$C_x = \frac{V_x(m^x - 1)}{(m - 1)(m - m_1)m^x}, \qquad C'_x = \frac{V_x(m_1^x - 1)}{(m_1 - 1)(m_1 - m)m_1^x}.$$
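
The second-order example can be checked numerically. The Python sketch below is a modern illustration, not Lagrange's computation: it accumulates the increments $\Delta C_x$ and $\Delta C'_x$ directly, so $V_x$ need not be constant. The function names, the starting values $C_0 = C'_0 = 0$, and the assumption that the homogeneous equation is $y_{x+2} - (m + m_1)y_{x+1} + mm_1 y_x = 0$ (the equation with roots $m$ and $m_1$) are all choices made here for the example.

```python
from fractions import Fraction

def variation_of_parameters(m, m1, V, x_max):
    """Particular solution y_x = C_x*m**x + C1_x*m1**x with C_0 = C1_0 = 0 (assumed)."""
    m, m1 = Fraction(m), Fraction(m1)
    C, C1 = Fraction(0), Fraction(0)
    ys = []
    for x in range(x_max + 1):
        ys.append(C * m**x + C1 * m1**x)
        # Increments taken from the two linear equations for Delta C_x and Delta C'_x.
        C += V(x) / (m**(x + 1) * (m - m1))
        C1 += V(x) / (m1**(x + 1) * (m1 - m))
    return ys

def residuals(ys, m, m1, V):
    """y_(x+2) - (m + m1)*y_(x+1) + m*m1*y_x - V_x, which should vanish."""
    return [ys[x + 2] - (m + m1) * ys[x + 1] + m * m1 * ys[x] - V(x)
            for x in range(len(ys) - 2)]

V = lambda x: x * x + 1                      # sample right-hand side
ys = variation_of_parameters(2, 3, V, 10)
print(all(r == 0 for r in residuals(ys, 2, 3, V)))
```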


13.7 Laplace: Nonhomogeneous Equations 


The method given by Laplace for solving a nonhomogeneous equation differed from 
the variation of parameters of Lagrange, but was analogous to Lagrange’s method for 
equations with constant coefficients (13.34). Suppose the equation to be 


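$$y_{x+n} + P_x y_{x+n-1} + Q_x y_{x+n-2} + \cdots + T_x y_{x+1} + U_x y_x = V_x.$$


Laplace assumed that there existed functions $p_x$ and $q_x$ such that $y_{x+1} = p_x y_x + q_x$.
This implied


$$y_{x+2} = p_{x+1}y_{x+1} + q_{x+1}, \quad \ldots, \quad y_{x+n} = p_{x+n-1}y_{x+n-1} + q_{x+n-1}.$$


Laplace introduced functions $\alpha_1, \alpha_2, \ldots, \alpha_{n-1}$ to obtain


$$\begin{aligned}
y_{x+n} &= p_{x+n-1}y_{x+n-1} + q_{x+n-1} \\
&= (p_{x+n-1} - \alpha_{n-1})y_{x+n-1} + (\alpha_{n-1}p_{x+n-2} - \alpha_{n-2})y_{x+n-2} \\
&\quad + (\alpha_{n-2}p_{x+n-3} - \alpha_{n-3})y_{x+n-3} + \cdots + \alpha_1 p_x y_x \\
&\quad + q_{x+n-1} + \alpha_{n-1}q_{x+n-2} + \cdots + \alpha_1 q_x.
\end{aligned}$$


He then chose $\alpha_1, \alpha_2, \ldots, \alpha_{n-1}$ such that


$$\begin{aligned}
P_x &= p_{x+n-1} - \alpha_{n-1}, \\
Q_x &= \alpha_{n-1}p_{x+n-2} - \alpha_{n-2}, \\
R_x &= \alpha_{n-2}p_{x+n-3} - \alpha_{n-3}, \\
&\;\;\vdots \\
T_x &= \alpha_2 p_{x+1} - \alpha_1, \\
U_x &= \alpha_1 p_x.
\end{aligned}$$

Therefore, $p_x$ satisfied


$$\prod_{i=0}^{n-1} p_{x+i} = P_x\prod_{i=0}^{n-2} p_{x+i} + Q_x\prod_{i=0}^{n-3} p_{x+i} + \cdots + p_x T_x + U_x;$$


$q_x$ satisfied an equation of order $n - 1$:


$$V_x = q_{x+n-1} + \alpha_{n-1}q_{x+n-2} + \cdots + \alpha_1 q_x.$$


By successive reduction, $q_x$ could be determined, although the equation satisfied by
$p_x$ was more difficult to handle.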




13.8 Exercises 


(1) In Proposition VII for recurrent series of the Doctrine of Chances, de Moivre
showed that if $a_0 + a_1x + a_2x^2 + \cdots = \frac{B(x)}{1 - fx + gx^2}$, with $B(x)$ a linear function,
then the two series $\sum_{n=0}^{\infty} a_{2n}x^{2n}$ and $\sum_{n=0}^{\infty} a_{2n+1}x^{2n+1}$ sum to a rational
function with denominator $1 - (f^2 - 2g)x^2 + g^2x^4$. If the denominator of
the original series was $1 - fx + gx^2 - hx^3$, then the denominator of the
two series would be $1 - (f^2 - 2g)x^2 - (2fh - g^2)x^4 - h^2x^6$. Work out
the details by following de Moivre’s method described in the text. Extend the
results to the case where the original series is divided into three parts. See
de Moivre (1967).

(2) Simpson showed that if $p$, $q$, and $r$ were the three cube roots of unity and
$f(x) = \sum_{n=0}^{\infty} a_n x^n$, then $p + q + r = p^2 + q^2 + r^2 = 0$, $p^3 + q^3 + r^3 = 3$,
and


$$\frac{f(px) + f(qx) + f(rx)}{3} = \sum_{n=0}^{\infty} a_{3n}x^{3n}.$$


He also explained how to generalize this to sum

$$\sum_{n=0}^{\infty} a_{mn+j}x^{mn+j}, \quad j = 0, 1, \ldots, m - 1,$$


by using $m$th roots of unity. Prove Simpson’s result and obtain the generalization.
Compare Simpson’s results with de Moivre’s in Exercise 1. See Simpson
(1759). Thomas Simpson (1710-1761) was a self-taught mathematician
who contributed to the popularization of mathematics and other intellectual
pursuits during that period in England. He was an editor of the Ladies Diary and
was one of the earliest mathematics professors at the Royal Military Academy
at Woolwich. See the excellent account by Clarke (1929).

(3) In Proposition IX for recurrent series of his Doctrine, de Moivre stated
that if $\sum a_n x^n$ and $\sum b_n x^n$ have the differential scales $1 - fx + gx^2$ and
$1 - mx + px^2$, respectively, then the differential scale of $\sum a_n b_n x^n$ is $1 - fmx +
(f^2p + m^2g - 2gp)x^2 - fgmpx^3 + g^2p^2x^4$. Prove this result. Compare with
Hadamard’s theorem on the multiplication of singularities in Hadamard (1899).
See de Moivre (1967).

(4) Solve the recurrence relation $a_{n+5} = a_{n+4} + a_{n+1} - a_n$, with $a_0 = 1$, $a_1 = 2$,
$a_2 = 3$, $a_3 = 3$, $a_4 = 4$ by recurrent series (generating function) as well as by
letting $a_n = \lambda^n$. See Euler (1988) p. 195.

(5) Use recurrent series to find the largest root of the equation $y^3 - 3y + 1 = 0$.
See Euler (1988) p. 288.

(6) Find the smallest root of $y^3 - 6y^2 + 9y - 1 = 0$. This value is $2(1 - \sin 70°)$.
See Euler (1988) p. 290.


13.9 Notes on the Literature 


In D. Bernoulli (1982-1996) vol. 1, pp. 133-189, U. Bottazzini has discussed Daniel 
Bernoulli’s early mathematical work, including difference equations, and has put it 
into historical perspective. Hald (1990) includes an interesting chapter on the use of 
difference equations to solve problems in probability theory by de Moivre, Lagrange, 
and Laplace. De Beaune, Girard, and Viète (1986) contains English translations by
Robert Schmidt of algebra texts by de Beaune, Girard, and Viète, giving easy access
to the methods and notation of early seventeenth-century French algebraists.


Differential Equations


14.1 Preliminary Remarks 


In the seventeenth century, before the development of calculus, problems reducible to 
differential equations began to appear in the study of general curves and in navigation. 
Interestingly, these differential equations were often related to the logarithm or 
exponential curve. For example, Harriot obtained the logarithmic spiral by projecting
the loxodrome onto the equatorial plane.¹ And in 1638, I. F. de Beaune (1601-1652)
posed to Descartes the problem of finding a curve such that the subtangent at each
point was a constant.² Note that the problem actually leads to the simple differential
equation $\frac{dy}{dx} = \frac{y}{a}$. Descartes replied with a solution involving the logarithmic function,
though he did not explicitly recognize it.³ In a paper of 1684, Leibniz gave the first
published solution by explicitly stating the problem as a differential equation.⁴

Newton understood the significance of the differential equation as soon as he started 
developing calculus. In his October 1666 tract on calculus, written a year after he 
graduated from Cambridge, he wrote⁵


If two Bodys A & B, by their velocitys p & q describe yᵉ [the] lines x and y. & an Equation bee
given expressing yᵉ relation twixt one of yᵉ lines x, & yᵉ ratio q/p of their motions q & p; To find
the other line y. Could this ever bee done all problems whatever might be resolved.


So Newton’s problem to solve all problems could be stated as follows: Given
$f\left(x, \frac{dy}{dx}\right) = 0$, find $y$. In a treatise prepared five years later, Newton gave a
classification of first-order differential equations $\frac{dy}{dx} = f(x, y)$.
In the 1660s, Isaac Barrow and James Gregory too dealt with differential equations, 
arising from geometric problems. Gregory considered the question of determining a 


1 See Pepper (1968).

2 Descartes (1897-1913) vol. IV, pp. 229-230. 
3 ibid. vol. II, pp. 514-517. 

4 Leibniz (1684); also see Scriba (1961). 

5 Newton (1967-1981) vol. 1, p. 403.

curve whose area of surface of revolution produced a given function. This translates
to the differential equation

$$y\sqrt{1 + \left(\frac{dy}{dx}\right)^2} = f(x),$$

where $f(x)$ is the given function. In connection with this, Barrow gave a geometric
solution for the differential equation

$$y + y\frac{dy}{dx} = a$$

by expressing the solution in terms of areas under hyperbolas.⁷ Geometrically, the
problem would be to find a curve y = f(x) such that the sum of the ordinate y and 
the subnormal yy’ is a constant. 

By the 1690s, it was clear to Newton, Leibniz, the two Bernoullis, and the 
other mathematicians with whom they corresponded that differential equations were 
intimately connected with curves and their properties. If they knew some property of a 
geometric object, such as the subtangent or curvature, then the problem of finding the 
curve itself usually led them to a differential equation. They had begun to recognize or 
get a glimpse of some general methods of solving these equations, such as separation 
of variables and multiplication of the equation by an integrating factor. 

Newton encountered differential equations in the geometrical and astronomical 
problems of the Principia; in his De Quadratura Curvarum of 1691-1692, Newton 
once again emphasized the importance of differential equations, or fluxional equa- 
tions, discussing a number of special methods for solving them, as well as the general 
separation of variables method. He wrote, “Should the equation involve both fluent
quantities, but can be arranged so that one side of the equation involves but a single one
together with its fluxion and the other the second alone with its fluxions.”⁸ The term
separation of variables was first used by Johann Bernoulli in his letter to Leibniz of
May 9, 1694, and then in a related paper published in November 1694.⁹ Bernoulli also
noted that there were important equations that could not be solved by this method, such as
$aa\,dy = xx\,dx + yy\,dx$. Observe that this is an equation between the differentials
$dx$ and $dy$; it is hence given the name “differential equation.” We would now write
it as $a^2\frac{dy}{dx} = x^2 + y^2$, a particular case of Riccati’s equation, to which we will
return later.

The first person to discover the integrating factor technique seems to be the Swiss 
mathematician Nicolas Fatio de Duillier (1664-1753). In 1687, he mastered the 
elements of differential and integral calculus by his own unaided efforts.¹⁰ Since
so little on this subject had been published, Fatio’s achievement was remarkable. 


6 Turnbull (1939) pp. 167 and 174. 

7 ibid. p. 167. 

8 Newton (1967-1981) vol. 7, p. 73. 

9 Bernoulli and Leibniz (1745) pp. 5-9, especially p. 7 and Bernoulli (1742) vol. 1, pp. 123-128. 
10 Newton (1967-1981) vol. 7, pp. 78-79, footnote 68.

He exchanged several letters on calculus with Huygens, to whom in February 1691
Fatio first communicated his method of multiplying an equation by $x^{\mu}y^{\nu}$ to possibly
put it into integrable form. Huygens in turn wrote Leibniz concerning Fatio’s method
for solving the differential equations


$$-2xy\,dx + 4x^2\,dy - y^2\,dy = 0 \quad\text{and}\quad -3a^2y\,dx + 2xy^2\,dx - 2x^2y\,dy + a^2x\,dy = 0.$$


Observe that, after multiplying across by $y^{-5}$, the first equation becomes the
differential of $-x^2y^{-4} + \frac{1}{2}y^{-2} = c$. And the second equation, when multiplied by
$x^{-4}$, can be integrated to yield $a^2x^{-3}y - x^{-2}y^2 = c$. Fatio later told Newton about
his method, and Newton included it in his De Quadratura, giving credit to Fatio.
The technique, as Newton explained and generalized it, was to multiply $f_1(x, y)\dot{x} +
f_2(x, y)\dot{y} = 0$ by $x^{\mu}y^{\nu}$ to get $M(x, y)\dot{x} + N(x, y)\dot{y} = 0$, where $M$ and $N$ were
polynomials or even algebraic functions of $x$ and $y$. The basic idea was to compute
$\frac{\partial}{\partial y}\int M(x, y)\,dx$ and choose $\mu$ and $\nu$ so that this quantity became equal to $N(x, y)$. If
this was possible, then $\int M(x, y)\,dx = c$ was the solution of the differential equation.
Newton even extended this method to second-, third-, and higher-order differential
equations. Regrettably, Newton did not include these results on differential equations
in the published version of De Quadratura; they were rediscovered by Leibniz and
the two elder Bernoullis. It should be noted that Newton’s acknowledgment¹¹ of
Fatio’s contribution was unusual; it showed the depth of their friendship at that time.
Unfortunately, it appears that in 1693, this friendship was abruptly and emotionally
terminated.¹²
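
The effect of Fatio's multiplier on the first equation can be checked numerically. The Python sketch below is a modern illustration and not a reconstruction of Fatio's or Newton's reasoning: it integrates the equation, rewritten as $\frac{dy}{dx} = \frac{2xy}{4x^2 - y^2}$, by the classical Runge-Kutta method and confirms that $-x^2y^{-4} + \frac{1}{2}y^{-2}$ stays constant along the solution. The starting point $(1, 1)$, the step count, and the function names are assumptions chosen for the example.

```python
def slope(x, y):
    """dy/dx obtained from -2xy dx + (4x^2 - y^2) dy = 0."""
    return 2 * x * y / (4 * x * x - y * y)

def potential(x, y):
    """Integral found after multiplying the equation by y**-5."""
    return -x * x / y**4 + 0.5 / y**2

def rk4(x, y, x_end, steps):
    """Classical fourth-order Runge-Kutta integration of dy/dx = slope(x, y)."""
    h = (x_end - x) / steps
    for _ in range(steps):
        k1 = slope(x, y)
        k2 = slope(x + h / 2, y + h * k1 / 2)
        k3 = slope(x + h / 2, y + h * k2 / 2)
        k4 = slope(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

y_end = rk4(1.0, 1.0, 2.0, 1000)
# The two printed values should agree closely.
print(potential(1.0, 1.0), potential(2.0, y_end))
```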

In the tenth of his 1691-92 lectures on integral calculus to l’Hôpital,¹³ Johann
Bernoulli gave an ingenious application of integrating factors to solve the separable
equation $ax\,dy - y\,dx = 0$. He multiplied the equation by $\frac{y^{a-1}}{x^2}$ to obtain


$$\frac{ay^{a-1}}{x}\,dy - \frac{y^a}{x^2}\,dx = 0.$$

Since the left-hand side was the differential of $\frac{y^a}{x}$, integration
yielded $\frac{y^a}{x} = c$. Note that, after separation of variables in the original differential
equation, one gets a logarithm on each side. But it seems that at that time Bernoulli
found some difficulty in working with logarithms in an analytic setting and found the
solution by a method avoiding the logarithm. It was only after an exchange of letters
with Leibniz that Bernoulli understood logarithms; in fact, in 1697 he published a
paper on exponentials and logarithms.¹⁴

In the 1690s, Leibniz and the Bernoullis also learned to handle first-order linear
differential equations. In a 1695 paper, Jakob Bernoulli raised the question of how to
solve the nonlinear equation $a\,dy = yp\,dx + by^nq\,dx$, where $p$ and $q$ were functions
of $x$ and $a$, $b$ were constants.¹⁵ In response, Leibniz as well as Johann Bernoulli
observed¹⁶ that the equation could be linearized by the substitution $v = y^{1-n}$.


11 ibid. p. 79. 

12 Westfall (1980) pp. 538-539. 

13 Bernoulli (1742) vol. 3, pp. 385-558.
14 Bernoulli (1742) vol. 1, pp. 179-187.
15 Bernoulli (1744) vol. 1, p. 663.

16 Bernoulli and Leibniz (1745) p. 199.

Bernoulli found an interesting method, applicable to linear equations as well, for
solving the equation by setting $y = mz$. This technique showed that in the case of
linear equations, $m$ could be chosen to be $e^{\int p\,dx}$, the reciprocal of the integrating
factor. Three decades later, in 1728, Euler wrote a paper, published in 1732, in which
he solved the linear equation by making use of an integrating factor, making due
reference to his teacher Bernoulli.¹⁷

The theory of linear differential equations with constant coefficients was developed 
much more slowly than one might expect. Recall that in 1728 Daniel Bernoulli
solved linear difference equations with constant coefficients¹⁸ by substituting $x^n$
in the difference equation to obtain an algebraic equation in $x$, whose solutions
$x_1, x_2, \ldots$ determined the possible values of $x$. The general solution was then a
linear combination $c_1x_1^n + c_2x_2^n + \cdots$ of the special solutions. Yet it took Euler
nearly a decade to perceive that he could solve the corresponding differential equation 
in a similar way. For more discussion on the alternating development of difference and 
differential equations, see Chapter 13. 
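
Daniel Bernoulli's substitution can be illustrated on a second-order difference equation. The Python sketch below is a modern rendering under stated assumptions: it treats only $a_n = pa_{n-1} + qa_{n-2}$ with two distinct real roots, and the function names and sample coefficients are hypothetical.

```python
import math

def closed_form(p, q, a0, a1):
    """Closed form of a_n = p*a_(n-1) + q*a_(n-2), assuming distinct real roots."""
    disc = p * p + 4 * q
    x1 = (p + math.sqrt(disc)) / 2    # roots of x^2 = p*x + q
    x2 = (p - math.sqrt(disc)) / 2
    c2 = (a1 - a0 * x1) / (x2 - x1)   # fit c1 + c2 = a0 and c1*x1 + c2*x2 = a1
    c1 = a0 - c2
    return lambda n: c1 * x1**n + c2 * x2**n

def by_recurrence(p, q, a0, a1, count):
    """The same sequence generated term by term."""
    terms = [a0, a1]
    while len(terms) < count:
        terms.append(p * terms[-1] + q * terms[-2])
    return terms[:count]

f = closed_form(10, 1, 0, 1)
print([round(f(n)) for n in range(7)])
print(by_recurrence(10, 1, 0, 1, 7))
```

With $p = 10$, $q = 1$ and initial values 0 and 1, both lines reproduce the sequence Bernoulli used for $\sqrt{26}$ in Chapter 13.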

The search for a general solution for a linear differential equation with constant 
coefficients seems to have started with Daniel Bernoulli’s letter to Euler¹⁹ of May 4,
1735, describing his work on the transverse vibration of a hanging elastic band fixed at
one end to a wall. Bernoulli wrote that he found the equation for the curve of vibration
to be $n^4\,d^4y = y\,dx^4$, $n$ a constant. He requested Euler’s help in solving the equation,
noting that if $p$ divided $m$, then the solutions of $n^p\,d^py = y\,dx^p$ were contained in
those of $n^m\,d^my = y\,dx^m$. It followed, he observed, that the logarithm satisfied both
his equation and $n^2\,ddy = y\,dx^2$, but that it was not general enough for his purpose.
Euler too was unable to solve the equation except as an infinite series. Commenting
on this, C. Truesdell remarked, “These are two great mathematicians who have just
shown themselves not fully familiar with the exponential function; we must recall that
this is 1735!”²⁰

In Proposition XXIV, Theorem XIX of Book II of his Principia, Newton gave a full 
description of his geometric treatment of simple harmonic motion. In a 1728 paper on 
simple harmonic motion, Johann Bernoulli gave a more analytic treatment, solving the 
second-order differential equation $\frac{d^2y}{dx^2} = -y$ by reducing it to a first-order equation.²¹
Hermann did similar work in a 1716 paper in the Acta Eruditorum.²² These seem to be
the earliest treatments of simple harmonic motion by the integration of the differential
equation describing the motion.

In a letter to Johann Bernoulli of May 5, 1739, Euler wrote that he had succeeded
in solving the third-order equation $a^3\,d^3y = y\,dx^3$, where $dx$ was assumed constant.²³


Note that this meant that x was the independent variable. Euler gave the solution as 


17 Eu. I-22, pp. 1-14. E 10, § 15.

18 Bernoulli (1982-1996) vol. 2, pp. 49-64.

19 Fuss (1968) vol. 2, pp. 419-423, especially p. 422.
20 Truesdell (1960) p. 167.

21 Bernoulli (1742) vol. II, p. 210.

22 Hermann (1716).

23 Eu. IVA-2, pp. 287-305, especially p. 302.

He gave no indication of how he found this, but he probably did not use a general
method, because in his letter to Bernoulli of September 15, 1739,²⁴ he wrote that he
had recently found a general method for solving in finite terms the equation


$$0 = y + a\frac{dy}{dx} + b\frac{ddy}{dx^2} + c\frac{d^3y}{dx^3} + \cdots.$$


He noted that the solution depended on the roots of the algebraic equation


$$1 - ap + bp^2 - cp^3 + dp^4 - ep^5 + \text{etc.} = 0.$$


As an example, he explained that the solution of Daniel Bernoulli’s equation $k^4\,d^4y =
y\,dx^4$ was determined by the algebraic equation $1 - k^4p^4 = 0$. Thus, the solution of
the differential equation emerged as²⁵


$$y = Ce^{-x/k} + De^{x/k} + E\sin\left(\frac{x}{k}\right) + F\cos\left(\frac{x}{k}\right).$$
In a letter of January 19, 1740,²⁶ Euler mentioned that he could also solve


$$0 = y + ax\frac{dy}{dx} + bx^2\frac{d^2y}{dx^2} + cx^3\frac{d^3y}{dx^3} + \cdots. \tag{14.1}$$


Johann Bernoulli replied to Euler in a letter of April 16, 1740,²⁷ that he too had solved
(14.1). He reduced its order by multiplying it by $x^p$ and choosing $p$ appropriately.
He remarked that he had actually done this before 1700 and also wrote that he had
found a special solution similar to that found by Euler for the equation with constant
coefficients. However, he was puzzled as to how the imaginary roots could lead to sines
and cosines. For about a year, he discussed this with Euler. Finally, Euler pointed out
that the equation $ddy + y\,dx^2 = 0$ had the obvious solution $y = 2\cos x$, also taking
the form $y = e^{x\sqrt{-1}} + e^{-x\sqrt{-1}}$. Bernoulli ended his letter by asking whether it was
possible to reduce the equation


$$yxx\,dx^2 + a\,ddy = 0 \quad \left(\text{i.e., } a\frac{d^2y}{dx^2} + x^2y = 0\right)$$


to a first-order equation. Euler answered²⁸ that by the substitution $y = e^{\int z\,dx}$ the
equation reduced to


$$xx\,dx + a\,dz + azz\,dx = 0 \tag{14.2}$$


24 ibid. p. 314. 

25 ibid. p. 315. 

26 ibid. p. 369. 

27 Fuss (1968) vol. 2, pp. 33-41. 
28 Eneström (1905).

and noted that this was a particular case of the Riccati equation $dy = yy\,dx + ax^m\,dx$
on which he had already written papers.

Interestingly, in a paper on curves and their differential equations, published 46
years earlier, Bernoulli stated an equation almost identical to (14.2) and wrote that
he had not solved it; he noted that, of course, separation of variables would not
work.²⁹ His older brother Jakob made persistent efforts to solve the equation and
finally succeeded in 1702. In a letter to Leibniz dated November 15, 1702,³⁰ some
of which was devoted to the relation of the sum $\sum \frac{1}{n^2}$ with integrals of the form
$\int x^{-1}\ln(1 + x)\,dx$, he mentioned in passing that he could solve $dy = yy\,dx + xx\,dx$ by
reducing it to $ddy = x^2y\,dx^2$ and then applying separation of variables. When Leibniz
asked for details, Jakob provided them in a letter of October 3, 1703.³¹ Here given in


modern notation, he defined a new function $z$ by the equation $y = -\frac{1}{z}\frac{dz}{dx}$ to reduce

$$\frac{dy}{dx} = x^2 + y^2$$

to the form

$$\frac{d^2z}{dx^2} + x^2z = 0.$$


He solved this second-order linear equation by an infinite series for z from which 
he obtained y as a quotient of two infinite series. After performing the division of one 
series by the other, his result was 

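$$y = \frac{x^3}{3} + \frac{x^7}{3\cdot 3\cdot 7} + \frac{2x^{11}}{3\cdot 3\cdot 3\cdot 7\cdot 11} + \frac{13x^{15}}{3\cdot 3\cdot 3\cdot 3\cdot 5\cdot 7\cdot 7\cdot 11} + \cdots.$$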


Now Newton would have been satisfied with this infinite series solution, whereas 
Leibniz and the Bernoullis had a different general outlook. They strove to find 
solutions in finite form, using the known elementary functions. Perhaps this may 
explain why mathematicians in the Leibniz-Bernoulli school took scant note of Jakob 
Bernoulli’s new method for dealing with the Riccati equation. However, Euler himself 
later rediscovered this method and generalized it. 
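
The leading coefficients of the series solution of $\frac{dy}{dx} = x^2 + y^2$ with $y(0) = 0$ can be recomputed mechanically. The Python sketch below does not follow Jakob Bernoulli's route through the quotient of two series; it simply matches powers of $x$ on both sides of the equation, and the function name is hypothetical.

```python
from fractions import Fraction

def riccati_series(order):
    """Coefficients c[0..order] of the solution of y' = x^2 + y^2 with y(0) = 0."""
    c = [Fraction(0)] * (order + 1)
    for k in range(order):
        # Coefficient of x^k on the right-hand side x^2 + y^2.
        rhs = Fraction(1) if k == 2 else Fraction(0)
        rhs += sum(c[i] * c[k - i] for i in range(k + 1))
        c[k + 1] = rhs / (k + 1)   # the left-hand side contributes (k+1)*c[k+1]
    return c

c = riccati_series(15)
print(c[3], c[7], c[11], c[15])
```

The four printed values are the coefficients of $x^3$, $x^7$, $x^{11}$, and $x^{15}$, and can be compared with the denominators 3, 63, 2079, and 218295.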

Jacopo Riccati (1676-1754) studied law at the University of Padua, but was 
encouraged to pursue mathematics by Stephano degli Angeli, who had earlier taught 
James Gregory. Riccati became interested in the equation named after him upon 
studying Gabriele Manfredi’s treatise De Constructione Aequationum Differentialium 
in which he considered the equation 


nxxdx —nyydx +xxdy = xydx, 


29 Bernoulli (1742) vol. 1, pp. 123-125. 
30 Leibniz (1971) vol. 3/1, pp. 62-66. 
31 ibid. pp. 72-79.

a special case of what is now known as the generalized Riccati equation


$$\frac{dy}{dx} = P + Qy + Ry^2,$$


where P, Q, and R are functions of x. Around 1720, Riccati and others worked on the 
special case


$$ax^m\,dx + yy\,dx = b\,dy, \tag{14.4}$$


and Riccati attempted a solution using separation of variables. Amusingly, Riccati cor- 
responded on this topic with all the then living Bernoulli mathematicians: Niklaus I, 
Johann, and his two sons Niklaus II and Daniel. The latter two determined by different 
methods a sequence of values of m for which the equation could be solved in finite 
terms. Riccati published his work in 1724³² with a note by D. Bernoulli, who gave
the announcement of the solution of the Riccati equation (14.4) as an anagram. About
a year later, without reference to his anagram, D. Bernoulli published details of his
solution. Briefly, his result was that the equation could be solved in terms of the
logarithmic, exponential and algebraic functions when $n = \frac{-4m}{2m \pm 1}$, that is, when $n$
was a number in the sequence


$$-\frac{4}{1},\ -\frac{4}{3},\ -\frac{8}{3},\ -\frac{8}{5},\ -\frac{12}{5},\ -\frac{12}{7},\ -\frac{16}{7},\ -\frac{16}{9},\ \ldots.$$

His method was to show that the substitutions $\frac{x^{n+1}}{n+1} = u$, $y = -\frac{1}{v}$ in the equation
$\frac{dy}{dx} = ax^n + by^2$ produced another equation of the same form, but with $n$ changed to
$-\frac{n}{n+1}$. Then again, the substitutions $x = \frac{1}{u}$, $y = -\frac{u}{b} - vu^2$ also produced another
equation of the same form, but with $n$ changed to $-n - 4$. Now when $n = 0$,
the equation was the integrable $\frac{dy}{dx} = a + by^2$. It followed that when $n = -4$, the
equation would still be integrable, and the same was true when $n = -\frac{-4}{-4+1} = -\frac{4}{3}$.
The result was the sequence given by Daniel Bernoulli. Note that when $m \to \infty$,
$n = \frac{-4m}{2m \pm 1} \to -2$; it turns out that when $n = -2$, the equation is still integrable. In
that case, $\frac{dy}{dx} = \frac{a}{x^2} + by^2$ and the substitution $y = \frac{v}{x}$ produces the separable equation

$$x\frac{dv}{dx} = a + v + bv^2.$$
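
The way the two substitutions generate Daniel Bernoulli's exponents can be traced step by step. The Python sketch below is an illustration with a hypothetical function name: starting from the integrable case $n = 0$, it alternately applies $n \mapsto -n - 4$ and $n \mapsto -\frac{n}{n+1}$.

```python
from fractions import Fraction

def bernoulli_exponents(count):
    """Exponents reached from n = 0 by alternating n -> -n - 4 and n -> -n/(n + 1)."""
    n = Fraction(0)
    out = []
    while len(out) < count:
        n = -n - 4              # substitution x = 1/u
        out.append(n)
        if len(out) < count:
            n = -n / (n + 1)    # substitution x**(n+1)/(n+1) = u
            out.append(n)
    return out

print([str(n) for n in bernoulli_exponents(8)])
```

The first eight values are exactly the numbers $\frac{-4m}{2m \pm 1}$ for $m = 1, 2, 3, 4$.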
Euler published many papers on the Riccati equation. In the 1730s, he elucidated 
its relation to continued fractions,³³ and solved it as a ratio of two infinite series³⁴ in
the manner of Jakob Bernoulli. Euler’s method here also demonstrated that these series
would be finite for those values of m defined by Daniel Bernoulli and his brother. In the
1760s, Euler published a demonstration that the generalized Riccati equation could be
transformed to a linear second-order equation, and conversely.³⁵ The paper was read


32 Riccati (1724); an English translation is available through the online Euler archive.
33 Eu. I-14, pp. 187-216. E 71, § 28.

34 Eu. I-22, pp. 19-35. E 31.

35 Eu. I-22, pp. 403-418. E 284.




to the Berlin Academy in 1742. He also showed that if one particular solution, $y_0$, of
the generalized Riccati equation were known, then, by the substitution $y = y_0 + \frac{1}{v}$,
Riccati’s equation could be reduced to the linear equation


$$\frac{dv}{dx} + (Q + 2Ry_0)v + R = 0. \tag{14.5}$$


On the other hand, if two solutions, $y_0$ and $y_1$, were known, then $w = \frac{y - y_0}{y - y_1}$ satisfied
the simpler equation


$$\frac{1}{w}\frac{dw}{dx} = R(y_0 - y_1). \tag{14.6}$$
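
Relation (14.6) can be spot-checked on a simple instance. The Python sketch below assumes the equation $\frac{dy}{dx} = 1 + y^2$ (so $P = 1$, $Q = 0$, $R = 1$), whose solutions are $\tan(x + c)$; the three particular solutions, the step size, and the function names are choices made only for this illustration.

```python
import math

def w(x):
    """w = (y - y0)/(y - y1) for three solutions of y' = 1 + y^2."""
    y0, y1, y = math.tan(x), math.tan(x + 1.0), math.tan(x + 0.5)
    return (y - y0) / (y - y1)

def log_derivative(x, h=1e-5):
    """(1/w) dw/dx, estimated by a central difference."""
    return (w(x + h) - w(x - h)) / (2 * h * w(x))

def predicted(x):
    """R*(y0 - y1) with R = 1."""
    return math.tan(x) - math.tan(x + 1.0)

print(log_derivative(0.2), predicted(0.2))
```

The two printed numbers should agree to several decimal places; the agreement is limited only by the finite-difference estimate.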


Interestingly, in 1841, Joseph Liouville proved the converse of D. Bernoulli’s 
theorem on the Riccati equation.³⁶ Liouville was very interested in the general
problem of integration in finite terms, using the elementary and algebraic functions, a
topic now seeing renewed interest in the area of symbolic integration. Liouville proved
that if $\frac{dy}{dx} = ax^n + by^2$ could be solved in finite terms, then $n$ had to be one of the
numbers determined by Bernoulli.

In a 1753 paper, Euler solved nonhomogeneous linear equations by the technique
of multiplying the equation by an appropriate function to reduce its order.³⁷ Later, in
1762,³⁸ Lagrange found another method for reducing the order of such an equation,
leading him to the concept of an adjoint, a label apparently first used in this context 
by Lazarus Fuchs about a century later. Briefly, Lagrange took the differential 
equation to be 


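$$Ly + M\frac{dy}{dt} + N\frac{d^2y}{dt^2} + \cdots = T, \tag{14.7}$$

where $L, M, N, \ldots, T$ were functions of $t$. He then multiplied the equation by some
function $z(t)$ and integrated by parts. Since


$$\int Mz\frac{dy}{dt}\,dt = Mzy - \int \frac{d}{dt}(Mz)\,y\,dt,$$

$$\int Nz\frac{d^2y}{dt^2}\,dt = Nz\frac{dy}{dt} - \frac{d}{dt}(Nz)\,y + \int \frac{d^2}{dt^2}(Nz)\,y\,dt,$$


and so on, the original equation was transformed to


$$y\left(Mz - \frac{d}{dt}(Nz) + \cdots\right) + \frac{dy}{dt}\left(Nz - \cdots\right) + \cdots + \int \left(Lz - \frac{d}{dt}(Mz) + \frac{d^2}{dt^2}(Nz) - \cdots\right)y\,dt = \int Tz\,dt. \tag{14.8}$$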


36 Liouville (1841). 
37 Eu. 1-22, pp. 181-213. E 188. 
38 Lagrange (1867-1892) vol. I, pp. 471-478.

Lagrange then took $z$ to be the function satisfying


$$Lz - \frac{d}{dt}(Mz) + \frac{d^2}{dt^2}(Nz) - \cdots = 0. \tag{14.9}$$


This was the adjoint equation, and if $z$ satisfied it, then the expression within
parentheses in the integral on the left-hand side of (14.8) would vanish. The remaining
equation would then be of order $n - 1$. In this way, the order of the equation
(14.7) was reduced by one, and the process could be continued. When Lagrange
applied this procedure to the adjoint equation (14.9) to reduce its order, he obtained
the homogeneous part of the equation (14.7). Thus, he saw that the adjoint of the
adjoint of a homogeneous equation was, in fact, that equation itself. Lagrange also discovered
the general method of variation of parameters in order to obtain the solution of
a nonhomogeneous equation, once the solution of the corresponding homogeneous
equation was known. Lagrange did this work around 1775,³⁹ but in 1739, Euler
applied this same method to the special equation $\frac{d^2y}{dx^2} + ky = X$.

We have seen that in his 1743 paper, Euler introduced the concepts of general and 
particular solutions of linear equations.⁴⁰ By choosing appropriate constants in the
general solution, any particular solution could be obtained. Taylor in 1715 and Clairaut
in 1734 found solutions for some special nonlinear equations, solutions not producible
by choosing constants in the general solutions.⁴¹ Of one solution, Taylor remarked
that it was singular, and so they were named. Euler also studied singular solutions;
he found it paradoxical that they could not be obtained from the general solutions.
He first encountered such a situation in the course of his study of mechanics in the
1730s. In his paper of 1754,⁴² he posed a number of geometric problems leading to
singular solutions, commenting that the paradox of singular solutions was not a mere 
aberration of mechanics. 

It appears that the French mathematician Alexis Claude Clairaut (1713-1765) was 
the first to give a geometric interpretation of a singular solution; this appeared in his 
1734 paper on differential equations. He considered the equation


$$y = (x + 1)\frac{dy}{dx} - \left(\frac{dy}{dx}\right)^2 = (x + 1)p - p^2.$$


Note here that the more general equation $y = xy' + f(y')$ is now called Clairaut’s
equation. Briefly describing his solution, we differentiate with respect to $x$ and
simplify the equation to get


$$dp\,(x + 1 - 2p) = 0.$$


The first factor gives $p = c$, a constant, so that $y = c(x + 1) - c^2$. The second factor
gives $p = \frac{x+1}{2}$, so that $y = \frac{(x+1)^2}{4}$ is a solution. Now the envelope (or, in a plane, the


39 ibid. vol. 4, pp. 5-108. 

40 Eu. I-22, pp. 108-149. E 62.

41 Taylor (1715) and Clairaut (1734).
42 Eu. I-22, pp. 214-236. E 236.

curve that is tangent to each one of a family of curves) of the family of straight lines
$y = c(x + 1) - c^2$ is found by eliminating $c$ from this equation and its derivative
with respect to $c$, given by $0 = x + 1 - 2c$. Thus, the envelope is given by $y = \frac{(x+1)^2}{4}$,
and in this case, the singular solution $y = \frac{(x+1)^2}{4}$ is the envelope of the family of
integral curves.

D’Alembert, Euler, and Laplace also studied singular solutions. These efforts
culminated in the general theory developed in the 1770s by Lagrange. Considering
equations of the form


$$f\left(x, y, \frac{dy}{dx}\right) = a_n(x, y)\left(\frac{dy}{dx}\right)^n + \cdots + a_1(x, y)\frac{dy}{dx} + a_0(x, y) = 0,$$

he gave two approaches to the study of singular solutions. In the second of these
methods, he differentiated the equation with respect to $x$ to obtain the expression for
the second derivative

$$\frac{d^2y}{dx^2} = -\frac{\dfrac{\partial f}{\partial x} + \dfrac{\partial f}{\partial y}\dfrac{dy}{dx}}{\dfrac{\partial f}{\partial y'}}.$$

Lagrange asserted that at the points of the singular solutions, the numerator and the
denominator both vanish. Hence, both the equations


$$\frac{\partial f}{\partial y'} = 0 \quad\text{and}\quad \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = 0$$

had to be satisfied along the singular solutions. Note that these relations are true for
Clairaut’s singular solution. In fact, the second equation is identically true in that case.
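
Clairaut's example lends itself to an exact check. The Python sketch below, with hypothetical function names, verifies in rational arithmetic that the parabola $y = \frac{(x+1)^2}{4}$ satisfies the equation $y = (x + 1)p - p^2$ and that each line $y = c(x + 1) - c^2$ touches it at $x = 2c - 1$ with the same slope.

```python
from fractions import Fraction

def line(c, x):
    """Member of the general solution: y = c*(x + 1) - c**2."""
    return c * (x + 1) - c * c

def parabola(x):
    """Singular solution y = (x + 1)**2 / 4."""
    return (x + 1) ** 2 / Fraction(4)

def satisfies_clairaut(y, p, x):
    """Check y = (x + 1)*p - p**2 at a point where the slope is p."""
    return y == (x + 1) * p - p * p

def touches_envelope(c):
    """The line with parameter c meets the parabola at x = 2c - 1 with equal slope."""
    x = 2 * c - 1
    same_point = line(c, x) == parabola(x)
    same_slope = c == (x + 1) / Fraction(2)   # slope of the parabola is (x + 1)/2
    return same_point and same_slope

print(all(touches_envelope(Fraction(k, 3)) for k in range(-6, 7)))
```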
In 1835, Cauchy was the first to treat the question of the existence of a solution of
a differential equation.⁴³ Earlier mathematicians had presented methods for solving
the equation on the assumption that solutions existed. Cauchy worked with a system
of first-order differential equations; given in its simplest form with only one equation,
his result can be stated: Suppose $f(x,y)$ is analytic in the variables $x$ and $y$ in the
neighborhood of a point $(x_0, y_0)$; then the equation $\frac{dy}{dx} = f(x,y)$ has a unique analytic
solution $y(x)$ in a neighborhood of $x_0$ such that $y(x_0) = y_0$. Cauchy’s first proof of
this theorem had gaps, and in the 1840s he published papers giving a more detailed
exposition of the result. The French mathematicians C. Briot and J. Bouquet worked
out a clearer and more complete presentation of this method, called the method of
majorants.⁴⁴ For a brief description of this method, take $x_0 = y_0 = 0$ and suppose

$$f(x,y) = \sum a_{kl}x^ky^l \quad\text{for } |x| < A,\ |y| < B.$$

Also, let $M$ be the maximum of $|f(x,y)|$ in this region. Then the function


43 Cauchy (1840-1841) vol. 1, pp. 327-384.
44 Briot and Bouquet (1856).

$$F(x,y) = \frac{M}{\left(1 - \frac{x}{A}\right)\left(1 - \frac{y}{B}\right)}$$

is such that if

$$F(x,y) = \sum A_{kl}x^ky^l$$

then

$$|a_{kl}| \le |A_{kl}|.$$


The differential equation $\frac{dy}{dx} = F(x,y)$ can now be solved explicitly. First, it can
be shown that the coefficients of the formal power series solution of $\frac{dy}{dx} = f(x,y)$ are
bounded by the coefficients of the explicit solution of $\frac{dy}{dx} = F(x,y)$, and this fact may
then be used to show that the formal solution is an actual solution and is unique. In the
1870s, Kovalevskaya showed that this method could be extended to a certain system
of partial differential equations. The paper containing this result formed a part of her
doctoral dissertation, supervised by Weierstrass.
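To see why the majorant equation can be solved explicitly, here is a short worked computation. It assumes the standard majorant $F(x,y) = M/\bigl((1 - x/A)(1 - y/B)\bigr)$ and the initial condition $y(0) = 0$; the intermediate steps are ours and are not quoted from Briot and Bouquet.

```latex
% Separate the variables in dy/dx = M / ((1 - x/A)(1 - y/B)) and integrate from 0:
\[
\left(1 - \frac{y}{B}\right) dy = \frac{M\,dx}{1 - \frac{x}{A}}
\quad\Longrightarrow\quad
y - \frac{y^2}{2B} = -AM \ln\left(1 - \frac{x}{A}\right).
\]
% Solve the quadratic in y, keeping the root that vanishes at x = 0:
\[
y = B\left(1 - \sqrt{1 + \frac{2AM}{B}\ln\left(1 - \frac{x}{A}\right)}\right).
\]
```

Under these assumptions the solution is analytic near $x = 0$, and the expression under the square root stays positive for $|x| < A\left(1 - e^{-B/(2AM)}\right)$, which gives an explicit neighborhood on which the comparison of coefficients can be carried out.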


14.2 Leibniz: Equations and Series 


Leibniz’s approach, as contrasted with Newton’s, called for only occasional use of 
infinite series. Nevertheless, Leibniz discussed the connection between series and 
calculus; he derived series for elementary functions in several ways. In a paper of 
1693, referring to Mercator and Newton, he derived the logarithmic and exponential 
series. As early as 1674, while working his way toward his final conception of the 


calculus, Leibniz discovered⁴⁶ the series for $\exp(x)$. He started with the series

$$y = x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$

and integrated to get

$$\int y\,dx = \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots = y - x. \qquad (14.10)$$


Note that at the time Leibniz did this calculation, he was still using the “omn.” 
notation for the integral. By taking the differential of both sides, he obtained 


$$y\,dx = dy - dx \quad\text{or}\quad dx = \frac{dy}{y+1}.$$


Now Leibniz knew from the work of N. Mercator that this last equation implied
$x = \ln(y + 1)$ so that


45 von Kowalevsky (1875).
46 See Scriba (1964).

$$\exp(x) = 1 + y = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots.$$


Although Leibniz did not explicitly write the last equation, he clearly understood the 
relation of the series to the logarithm. 
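Leibniz's termwise integration can be replayed on a truncated series. The sketch below is an illustration under our own assumptions: the truncation order `N` and the variable names are arbitrary, and factorials are written in modern notation.

```python
from fractions import Fraction
from math import exp, factorial

N = 12  # number of terms kept; an arbitrary truncation for illustration

# Coefficients of y = x + x^2/2! + x^3/3! + ... (index k holds the x^k coefficient).
y = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, N + 1)]

# Termwise integral: the x^k term of y becomes x^(k+1)/(k+1).
integral = [Fraction(0)] + [y[k] / (k + 1) for k in range(N + 1)]

# Coefficients of y - x.
y_minus_x = y[:]
y_minus_x[1] -= 1

# The two series agree up to the truncation order, as in (14.10).
agree = integral[: N + 1] == y_minus_x

def one_plus_y(x):
    """Partial sum of 1 + y, to be compared with exp(x)."""
    return 1 + sum(float(c) * x**k for k, c in enumerate(y))

print(agree, abs(one_plus_y(1.0) - exp(1.0)))
```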

Leibniz wrote a letter to Huygens,⁴⁷ dated September 4/14, 1694, explaining his
new calculus techniques. Here note that Leibniz gave two dates because of the ten-day
difference between the Julian and Gregorian calendars during that period. Huygens,
at the close of an outstanding scientific career, was still eager to learn of the new
advances in mathematics. He had already glimpsed the power of calculus from the
work of Leibniz and the two Bernoullis on the catenary problem. Leibniz’s first
example for Huygens was the derivation of the infinite series for $\cos x$ from its
differential equation. He started by observing that if $y$ was the arc of a circle of radius
$a$, and $x$ denoted $\cos y$, then

$$y = a\int \frac{dx}{\sqrt{a^2 - x^2}} \qquad (14.11)$$

and

$$dy = \frac{a\,dx}{\sqrt{a^2 - x^2}} \qquad (14.12)$$

or

$$\sqrt{a^2 - x^2}\,dy = a\,dx. \qquad (14.13)$$


Note here that the integral (14.11) was Leibniz’s definition of arcsine or arccosine,
as given in his 1686 paper.⁴⁸ We follow Leibniz almost word for word. He set

$$v = \sqrt{a^2 - x^2} \qquad (14.14)$$

so that

$$v\,dy = a\,dx. \qquad (14.15)$$

By differentiating this equation he found

$$v\,ddy + dv\,dy = a\,ddx. \qquad (14.16)$$


Leibniz then assumed that the arcs y increased uniformly, that is, dy was a constant 
or ddy was zero. Recall that in our terms this meant that he was taking y as the 
independent variable and x and v as functions of y. Thus, he had ddy = 0, and 
equation (14.16) reduced to 


$$dv\,dy = a\,ddx. \qquad (14.17)$$


47 Leibniz (1971) vol. 2, pp. 195-196.
48 ibid. vol. 3, pp. 226-235.
To eliminate $v$, he observed that from (14.14), $v^2 = a^2 - x^2$, and therefore
$v\,dv = -x\,dx$ or

$$dv = -\frac{x\,dx}{v}. \qquad (14.18)$$

By (14.15) and (14.18),

$$dv = -\frac{x\,dy}{a}; \qquad (14.19)$$

then, by (14.17) and (14.19), he arrived at the required differential equation,

$$-x\,dy\,dy = a^2\,ddx. \qquad (14.20)$$


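In order to derive the series for $a\cos\left(\frac{y}{a}\right)$ from this equation, Leibniz set $x$ equal to a
series in powers of $y$ with undetermined coefficients. He substituted this in (14.20) and
equated coefficients. After a detailed calculation, he arrived at

$$x = a - \frac{1}{1\cdot 2\,a}y^2 + \frac{1}{1\cdot 2\cdot 3\cdot 4\,a^3}y^4 - \frac{1}{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6\,a^5}y^6 + \text{etc.}$$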
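The coefficient comparison can be carried out mechanically. The sketch below reads (14.20) in modern terms as $a^2\,d^2x/dy^2 = -x$ with $x(0) = a$ and $dx/dy = 0$ at $y = 0$; those initial conditions, the function names, and the sample value $a = 2$ are our assumptions, chosen so that the series is that of $a\cos(y/a)$.

```python
from fractions import Fraction
from math import cos

def cosine_coefficients(a, terms):
    """Coefficients c_k of x = sum c_k y^k solving a^2 x'' = -x, x(0) = a, x'(0) = 0."""
    c = [Fraction(a), Fraction(0)]
    for k in range(terms - 2):
        # Equating the coefficients of y^k: a^2 (k + 2)(k + 1) c_{k+2} = -c_k.
        c.append(-c[k] / (Fraction(a) ** 2 * (k + 1) * (k + 2)))
    return c

coeffs = cosine_coefficients(2, 20)

def series(y):
    """Partial sum of the power series at the point y."""
    return sum(float(c) * y**k for k, c in enumerate(coeffs))

print(coeffs[2], coeffs[4], abs(series(1.5) - 2 * cos(1.5 / 2)))
```

With $a = 2$ the first nonzero coefficients are $-\frac{1}{1\cdot 2\,a} = -\frac14$ and $\frac{1}{1\cdot 2\cdot 3\cdot 4\,a^3} = \frac{1}{192}$, matching the pattern of the series for $a\cos(y/a)$.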


14.3 Newton on Separation of Variables 


To get an idea of Newton’s thinking and notation, we consider one simple example on
separation of variables from De Quadratura.⁴⁹ He began with the equation
$-ax\dot{x}y^2 = a^4\dot{y} + a^3x\dot{y}$. He separated the variables

$$\frac{aa}{a+x}\dot{x} - a\dot{x} = \frac{a^3\dot{y}}{yy}$$

and then integrated to get

$$\Box\,\frac{aa}{a+x} - ax = -\frac{a^3}{y}.$$

In Newton’s notation, the square box denoted an integral. Sometimes, he replaced
the box by the letter Q, denoting the Latin expression for area. He had no special
notation for the logarithm and merely referred to it as the area of the hyperbola with
ordinate $\frac{aa}{a+x}$ and abscissa $x$. He then rewrote the last equation, omitting the constant
term, as an infinite series


49 Newton (1967-1981) vol. 7, p. 73.
“Or again, if $c^2$ is any given quantity, then is

$$\frac{a^3}{y} = \frac{1}{2}x^2 - \frac{1}{3}\frac{x^3}{a} + \frac{1}{4}\frac{x^4}{aa} - \cdots$$

the equation to be found.”

We note in passing that in a 1691 letter to Huygens,⁵⁰ Leibniz discussed an
example of a separable equation. Apparently, Johann Bernoulli first used the expression
separation of variables in a May 1694 letter to Leibniz. He wrote,⁵¹ “Ut in aequationibus
differentialibus indeterminatae x cum suis differentialibus dx separentur ab
indeterminatis y and dy.” That is, in differential equations the indeterminates $x$ with
their differentials $dx$ are to be separated from the indeterminates $y$ and $dy$.

Newton illustrated Fatio’s method of integrating factors,⁵² starting with

$$9x\dot{x}y - 18\dot{x}y^2 - 18xy\dot{y} + 5x^2\dot{y} = 0. \qquad (14.21)$$

Applying Fatio’s technique, he multiplied the equation by $x^\mu y^\nu$ to get

$$9\dot{x}x^{\mu+1}y^{\nu+1} - 18\dot{x}x^{\mu}y^{\nu+2} - 18x^{\mu+1}y^{\nu+1}\dot{y} + 5x^{\mu+2}y^{\nu}\dot{y} = 0.$$

He then integrated with respect to $x$ the terms forming the coefficient of $\dot{x}$ to obtain

$$\frac{9}{\mu+2}x^{\mu+2}y^{\nu+1} - \frac{18}{\mu+1}x^{\mu+1}y^{\nu+2}.$$

The fluxion of this expression would then reproduce the terms containing $\dot{x}$, but the
terms with $\dot{y}$ would be given by

$$\frac{9(\nu+1)}{\mu+2}x^{\mu+2}y^{\nu}\dot{y} - \frac{18(\nu+2)}{\mu+1}x^{\mu+1}y^{\nu+1}\dot{y}.$$

For this to agree with (14.21), Newton required that $\frac{9(\nu+1)}{\mu+2} = 5$ and $\frac{\nu+2}{\mu+1} = 1$; or that
$\mu = \frac{5}{2}$ and $\nu = \frac{3}{2}$. Hence, with $g$ a constant, the solution of the fluxional equation was

$$2x^{\frac{9}{2}}y^{\frac{5}{2}} - \frac{36}{7}x^{\frac{7}{2}}y^{\frac{7}{2}} + g = 0.$$
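The exponents and the final integral can be verified exactly. The following sketch assumes (14.21) reads $9x\dot{x}y - 18\dot{x}y^2 - 18xy\dot{y} + 5x^2\dot{y} = 0$; the dictionary representation of sums of monomials and all helper names are our own illustrative choices, not Newton's or Fatio's procedure.

```python
from fractions import Fraction as Fr

def d_dx(poly):
    """Partial derivative in x of a sum of terms c * x^i * y^j, stored as {(i, j): c}."""
    return {(i - 1, j): c * i for (i, j), c in poly.items() if i != 0}

def d_dy(poly):
    """Partial derivative in y of the same kind of expression."""
    return {(i, j - 1): c * j for (i, j), c in poly.items() if j != 0}

def times_monomial(poly, mu, nu):
    """Multiply every term by x^mu * y^nu."""
    return {(i + mu, j + nu): c for (i, j), c in poly.items()}

# Coefficients of the fluxions of x and y in (14.21).
coeff_xdot = {(Fr(1), Fr(1)): Fr(9), (Fr(0), Fr(2)): Fr(-18)}   # 9xy - 18y^2
coeff_ydot = {(Fr(1), Fr(1)): Fr(-18), (Fr(2), Fr(0)): Fr(5)}   # -18xy + 5x^2

# nu + 2 = mu + 1 gives mu = nu + 1; then 9(nu + 1) = 5(nu + 3), so nu = 6/4.
nu = Fr(15 - 9, 9 - 5)
mu = nu + 1

# Integral in x of the coefficient of the fluxion of x, after multiplying by x^mu y^nu.
potential = {
    (mu + 2, nu + 1): Fr(9) / (mu + 2),
    (mu + 1, nu + 2): Fr(-18) / (mu + 1),
}

exact = (d_dx(potential) == times_monomial(coeff_xdot, mu, nu)
         and d_dy(potential) == times_monomial(coeff_ydot, mu, nu))
print(mu, nu, exact)
```

When `exact` holds, the multiplied equation is the fluxion of `potential`, whose two coefficients are $2$ and $-\frac{36}{7}$.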


14.4 Johann Bernoulli’s Solution of a First-Order Equation 


We have seen that quite early in the study of differential equations, mathematicians 
noticed that separation of variables and integrating factors were applicable in many 
special situations. At the same time, they observed that there were simple first-order 
equations to which these methods could not be directly applied. Jakob Bernoulli, in 


50 Gerhardt (1898) p. 680. 
51 Bernoulli and Leibniz (1745) pp. 5-9, especially p. 7.
52 Newton (1967-1981) vol. 7, pp. 78-81. See also Whiteside’s footnote concerning Fatio on pp. 78-79. 




the November 1695 issue of the Acta Eruditorum, posed the problem of solving the 
differential equation 


$$a\,dy = yp\,dx + by^nq\,dx, \qquad (14.22)$$


where $p$ and $q$ were functions of $x$ and $a$ was a constant. In 1696, Leibniz noted that
this problem could be reduced to a linear equation, though he did not give details.
A year later, Johann Bernoulli wrote⁵⁴ that $y^{1-n} = v$ would reduce the equation to the
linear equation

$$\frac{1}{1-n}\,a\,dv = vp\,dx + bq\,dx. \qquad (14.23)$$


However, his alternative method was to take $y$ to be a product of two new variables
so that the extra variable could be appropriately chosen. Thus, he set $y = mz$ so that
$dy = m\,dz + z\,dm$ and equation (14.22) would take the form

$$az\,dm + am\,dz = mzp\,dx + bm^nz^nq\,dx. \qquad (14.24)$$

He then set $am\,dz = mzp\,dx$ or

$$a\,dz : z = p\,dx. \qquad (14.25)$$


Hence, $z$ could be found in terms of $x$. Bernoulli denoted this function by $\xi$. In
the notation developed by Euler in the late 1720s, one would write $\xi = c^{\frac{1}{a}\int p\,dx}$ or
$e^{\frac{1}{a}\int p\,dx}$. Note that Euler changed the $c$ to $e$ in the 1730s. Equation (14.24) was then
reduced to

$$az\,dm = bm^nz^nq\,dx \quad\text{or}\quad a\xi\,dm = bm^n\xi^nq\,dx$$

or

$$am^{-n}\,dm = b\xi^{n-1}q\,dx. \qquad (14.26)$$

After integration, he had

$$\frac{a}{1-n}m^{1-n} = b\int \xi^{n-1}q\,dx. \qquad (14.27)$$
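As a concrete instance of Bernoulli's product substitution, take the hypothetical case $a = b = 1$, $p = q = 1$, $n = 2$, so that (14.22) becomes $dy = y\,dx + y^2\,dx$. The choice of example, the constant `C`, and the finite-difference check are ours; the sketch only confirms numerically that $y = mz$ built from (14.25) and (14.27) satisfies the equation.

```python
from math import exp

def bernoulli_solution(x, C):
    """y = m * z for dy = y dx + y^2 dx, i.e. a = b = 1, p = q = 1, n = 2 in (14.22)."""
    z = exp(x)                 # from a dz : z = p dx, so z = e^x
    m = -1.0 / (exp(x) + C)    # from (14.27) with n = 2: -1/m = integral of z dx = e^x + C
    return m * z

def residual(x, C, h=1e-5):
    """Central-difference estimate of dy/dx - (y + y^2); it should be close to zero."""
    dy = (bernoulli_solution(x + h, C) - bernoulli_solution(x - h, C)) / (2 * h)
    y = bernoulli_solution(x, C)
    return dy - (y + y * y)

print(residual(0.3, 2.0), residual(1.0, 5.0))
```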


In 1728, Euler made use of the integrating factor suggested by Bernoulli’s method
to solve nonhomogeneous linear equations of the first order.⁵⁵ The specific equation
he faced was


22dt dt 


dz = 
Cl end apes 


53 Bernoulli (1744) vol. 1, p. 663.

54 ibid. pp. 174-179. Also see his letter to Leibniz in which he gave exactly the same proofs, except that he
wrote ξ as X.

55 Eu. I-22, pp. 1-14, especially pp. 10-12. E 10.
He found the integrating factor by taking the exponential of the integral of the
coefficient of $z$. So he multiplied the equation by $c^{\int \frac{2\,dt}{t-1}} = (t-1)^2$ and then integrated
to obtain the solution


In the general situation, given by Euler in a presentation of 1750,⁵⁶ this would require
the solution of $\frac{dy}{dx} + py = q$. Multiplying by $e^{\int p\,dx}$, and using $e$ instead of $c$, Euler
wrote

$$\frac{d}{dx}\left(ye^{\int p\,dx}\right) = qe^{\int p\,dx}$$

and hence

$$y = e^{-\int p\,dx}\int qe^{\int p\,dx}\,dx.$$


14.5 Euler on General Linear Equations with Constant Coefficients 


Euler seems to have been the first mathematician to apply linear superposition of 
special solutions to obtain the general solution of a linear differential equation. One 
may contrast this with the method employed by Johann Bernoulli⁵⁷ to solve the
equation for simple harmonic motion,

$$n^2a^2\frac{d^2x}{dy^2} = a - x.$$

Note that we have slightly modernized Bernoulli’s notation; he wrote the equation as
$nnaa\,ddx : dy^2 = a - x$.


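Note also that $dy^2$ stands for $(dy)^2$. Bernoulli multiplied the equation by $dx$ (or $\frac{dx}{dy}$)
and integrated to get

$$\frac{1}{2}n^2a^2\frac{dx^2}{dy^2} = ax - \frac{1}{2}x^2.$$

Here observe that Bernoulli used $\int dx\,ddx = \frac{1}{2}(dx)^2$; next, he had

$$n\int\frac{a\,dx}{\sqrt{2ax - x^2}} = y, \quad\text{or}\quad y = na\arcsin\frac{x-a}{a} + C.$$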


56 Eu. I-22, pp. 181-213. E 188, § 7.
57 Bernoulli (1742) vol. 3, p. 210.
Bernoulli presented his solution in this form with no mention of linear superposition
of particular sine and cosine solutions. In a paper presented to the Berlin Academy in
1742 and published in 1743,⁵⁸ Euler considered an equation of the form

$$0 = Ay + B\frac{dy}{dx} + C\frac{d^2y}{dx^2} + D\frac{d^3y}{dx^3} + \cdots + N\frac{d^ny}{dx^n} \qquad (14.28)$$

and observed that if $y = u$ was a solution, then so was $y = \alpha u$, with $\alpha$ a constant.
Moreover, if $n$ particular solutions $y = u$, $y = v, \ldots$ could be found, then the general
solution would be $y = \alpha u + \beta v + \cdots$. To obtain these special solutions, he took
$y = e^{\int p\,dx}$. Euler then wrote out the derivatives of $y$:

$$\begin{aligned}
y &= e^{\int p\,dx},\\
\frac{dy}{dx} &= e^{\int p\,dx}\,p,\\
\frac{d^2y}{dx^2} &= e^{\int p\,dx}\left(p^2 + \frac{dp}{dx}\right),\\
\frac{d^3y}{dx^3} &= e^{\int p\,dx}\left(p^3 + 3p\frac{dp}{dx} + \frac{d^2p}{dx^2}\right),\\
\frac{d^4y}{dx^4} &= e^{\int p\,dx}\left(p^4 + 6p^2\frac{dp}{dx} + 4p\frac{d^2p}{dx^2} + 3\frac{dp^2}{dx^2} + \frac{d^3p}{dx^3}\right).
\end{aligned}$$

When these values were substituted in the differential equation, the expression
would be simplest when $p$ was a constant, for in that case the derivatives of $p$ would
vanish; $p$ would then satisfy the algebraic equation

$$0 = A + Bz + Czz + Dz^3 + \cdots + Nz^n.$$

If $z = \frac{s}{t}$ was a root of this equation, then $s - tz = 0$ satisfied the $n$th-degree algebraic
equation, and the solution $y = \alpha e^{sx/t}$ of the differential equation

$$sy - t\frac{dy}{dx} = 0$$

also satisfied the $n$th-order differential equation. When there was a repeated root, so
that $(s - tz)^2 = ss - 2stz + ttzz = 0$ was a factor of the $n$th-degree polynomial, then
there would be a corresponding factor

$$ssy - 2st\frac{dy}{dx} + tt\frac{ddy}{dx^2} = 0$$

of (14.28). To solve this second-order equation, Euler set $y = e^{sx/t}u$ to find that $u$
satisfied

$$\frac{ddu}{dx^2} = 0.$$


58 Eu. I-22, pp. 108-149. E 62.




Thus, $u = \alpha x + \beta$ and $y = e^{sx/t}(\alpha x + \beta)$. Similarly, when there were three repeated
roots, then $y = e^{sx/t}(\alpha x^2 + \beta x + \gamma)$. In the general case where a root was repeated $k$
times, the solution would turn out to be $e^{sx/t}$ times a general polynomial of degree $k - 1$.
Euler then considered complex roots, making use of a result he had published three
years earlier, that the general solution of the equation

$$\frac{d^2y}{dx^2} + ky = 0$$

was

$$A\cos\sqrt{k}\,x + B\sin\sqrt{k}\,x.$$

So he supposed that $a - bz + cz^2 = 0$ was a quadratic factor of the polynomial; this
yielded $z = \frac{b \pm \sqrt{b^2 - 4ac}}{2c}$. Then he assumed $\frac{b}{2\sqrt{ac}} < 1$ and set $\cos\phi = \frac{b}{2\sqrt{ac}}$. Euler
wrote the quadratic factor as $a - 2z\sqrt{ac}\cos\phi + cz^2$ and the corresponding differential
equation as

$$0 = ay - 2\sqrt{ac}\cos\phi\,\frac{dy}{dx} + c\frac{d^2y}{dx^2}.$$

The substitution $y = e^{\sqrt{a/c}\,x\cos\phi}\,u$ reduced the equation to

$$c\frac{d^2u}{dx^2} + (a\cos^2\phi - 2a\cos^2\phi + a)u = 0;$$

note that this was of the required form $\frac{d^2u}{dx^2} + ku = 0$. Euler then proceeded to the case
where there were repeated complex roots.


14.6 Euler: Nonhomogeneous Equations 


In his 1750 paper, published in 1753,⁵⁹ Euler described two methods for solving the
nonhomogeneous equation

$$X = A_0y + A_1\frac{dy}{dx} + A_2\frac{d^2y}{dx^2} + \cdots + A_n\frac{d^ny}{dx^n}. \qquad (14.29)$$

In section 6 of this paper, he assumed $X$ to be a polynomial and found a particular
solution by taking $y$ to be a polynomial of the same degree as $X$, then substituting
in the differential equation to find the coefficients. He then arrived at the general solution
by adding the general solution of the homogeneous equation to the particular
solution.


59 Eu. I-22, pp. 181-213. E 188.
In sections 7 through 22, Euler described a second method of solution; in section 23,
he presented four examples. Now in section 7, he considered the first-order equation

$$X = A_0y + A_1\frac{dy}{dx}, \qquad (14.30)$$

to solve which he first multiplied by $e^{\alpha x}\,dx$ to obtain

$$e^{\alpha x}X\,dx = A_0e^{\alpha x}y\,dx + A_1e^{\alpha x}\,dy. \qquad (14.31)$$

To determine $\alpha$ he assumed

$$\int e^{\alpha x}X\,dx = A_1e^{\alpha x}y; \qquad (14.32)$$

he then took the differential of equation (14.32) to obtain

$$e^{\alpha x}X\,dx = A_1\alpha e^{\alpha x}y\,dx + A_1e^{\alpha x}\,dy. \qquad (14.33)$$

By equating (14.31) and (14.33), he got

$$\alpha = \frac{A_0}{A_1} \quad\text{or}\quad A_1\alpha - A_0 = 0, \qquad (14.34)$$

and using (14.32) Euler had the solution of (14.30):

$$y = \frac{e^{-\alpha x}}{A_1}\int e^{\alpha x}X\,dx. \qquad (14.35)$$

In section 8, Euler considered the second-order equation

$$X = A_0y + A_1\frac{dy}{dx} + A_2\frac{d^2y}{dx^2} \qquad (14.36)$$

that he multiplied by $e^{\alpha x}\,dx$ to obtain

$$e^{\alpha x}X\,dx = A_0e^{\alpha x}y\,dx + A_1e^{\alpha x}\,dy + A_2e^{\alpha x}\frac{d^2y}{dx}. \qquad (14.37)$$

Euler next assumed

$$\int e^{\alpha x}X\,dx = e^{\alpha x}\left(B_0y + B_1\frac{dy}{dx}\right) \qquad (14.38)$$

and took its differential to get

$$e^{\alpha x}X\,dx = e^{\alpha x}\left(\alpha B_0y\,dx + B_0\,dy + B_1\frac{d^2y}{dx} + \alpha B_1\,dy\right). \qquad (14.39)$$

Equating (14.37) and (14.39) yielded

$$B_1 = A_2, \quad B_0 = A_1 - \alpha A_2, \quad A_0 = \alpha A_1 - \alpha^2A_2. \qquad (14.40)$$

Thus, $\alpha$ was a solution of the second-degree equation

$$A_2\alpha^2 - A_1\alpha + A_0 = 0, \qquad (14.41)$$

so that Euler had next to solve the equation of order one arising from (14.38),

$$e^{-\alpha x}\int e^{\alpha x}X\,dx = B_0y + B_1\frac{dy}{dx}. \qquad (14.42)$$

Using (14.35), the solution of (14.42) would be given by

$$y = \frac{e^{-\beta x}}{B_1}\int e^{(\beta-\alpha)x}\left(\int e^{\alpha x}X\,dx\right)dx, \qquad (14.43)$$

where, from (14.34) and (14.40),

$$\beta = \frac{B_0}{B_1} = \frac{A_1 - \alpha A_2}{A_2} = \frac{A_1}{A_2} - \alpha.$$

Clearly then, $\beta$ must be a root of the quadratic equation (14.41). Here Euler
assumed that the two roots $\alpha$ and $\beta$ were distinct. He next applied integration by parts
to equation (14.43) to obtain

$$y = \frac{e^{-\beta x}}{A_2}\left[\frac{e^{(\beta-\alpha)x}}{\beta-\alpha}\int e^{\alpha x}X\,dx - \frac{1}{\beta-\alpha}\int e^{\beta x}X\,dx\right]$$

or

$$y = \frac{e^{-\alpha x}}{A_2(\beta-\alpha)}\int e^{\alpha x}X\,dx + \frac{e^{-\beta x}}{A_2(\alpha-\beta)}\int e^{\beta x}X\,dx \qquad (14.44)$$

as the solution of the second-order equation (14.36). Note that if $P(x) = A_2x^2 + A_1x + A_0$,
then by (14.40)

$$\frac{P(x)}{x+\alpha} = A_2x + A_1 - A_2\alpha = B_1x + B_0. \qquad (14.45)$$

Moreover, taking $P'(x)$ as the derivative of $P(x)$, (14.44) may be written:

$$y = \frac{e^{-\alpha x}}{P'(-\alpha)}\int e^{\alpha x}X\,dx + \frac{e^{-\beta x}}{P'(-\beta)}\int e^{\beta x}X\,dx.$$

Euler next considered the $n$th-order equation

$$X = A_0y + A_1\frac{dy}{dx} + A_2\frac{d^2y}{dx^2} + \cdots + A_n\frac{d^ny}{dx^n}, \qquad (14.46)$$

first working with the case in which the $n$th-degree polynomial

$$P(x) = A_0 + A_1x + \cdots + A_nx^n \qquad (14.47)$$

had $n$ distinct roots $-\alpha_1, -\alpha_2, \ldots, -\alpha_n$. He gave the solution of (14.46) as

$$y = \frac{e^{-\alpha_1x}}{P'(-\alpha_1)}\int e^{\alpha_1x}X\,dx + \frac{e^{-\alpha_2x}}{P'(-\alpha_2)}\int e^{\alpha_2x}X\,dx + \cdots + \frac{e^{-\alpha_nx}}{P'(-\alpha_n)}\int e^{\alpha_nx}X\,dx, \qquad (14.48)$$


although he did not prove it in general. Rather, he essentially worked out the case for
an equation of order 3, noticed the pattern in the solutions for $n = 1, 2, 3$ and then
stated the general case. We here present a general proof, whose interesting feature
is that it requires the application of Lagrange’s interpolation formula. This is a proof
that Euler could have given, and he had himself discovered a formula from which
Lagrange’s interpolation formula followed. See his paper, “De eximio usu methodi
interpolationum in serierum doctrina.”⁶⁰
Following Euler, we reduce the order of equation (14.46) by taking

$$e^{-\alpha_1x}\int e^{\alpha_1x}X\,dx = B_0y + B_1\frac{dy}{dx} + \cdots + B_{n-1}\frac{d^{n-1}y}{dx^{n-1}}, \qquad (14.49)$$

where $-\alpha_1$ is a root of the polynomial (14.47). By taking the derivative of

$$\int e^{\alpha_1x}X\,dx = e^{\alpha_1x}\left(B_0y + B_1\frac{dy}{dx} + \cdots + B_{n-1}\frac{d^{n-1}y}{dx^{n-1}}\right),$$

and, following the procedure by which he obtained (14.45) from (14.38), Euler
showed that

$$\frac{P(x)}{x+\alpha_1} = B_0 + B_1x + \cdots + B_{n-1}x^{n-1} = Q(x). \qquad (14.50)$$

Assuming (14.48) to be true up to $n - 1$ and with

$$Y = e^{-\alpha_1x}\int e^{\alpha_1x}X\,dx, \qquad (14.51)$$

we have the general solution of (14.49) as

$$y = \frac{e^{-\alpha_2x}}{Q'(-\alpha_2)}\int e^{\alpha_2x}Y\,dx + \cdots + \frac{e^{-\alpha_nx}}{Q'(-\alpha_n)}\int e^{\alpha_nx}Y\,dx. \qquad (14.52)$$

Observe that

$$P(x) = (x+\alpha_1)Q(x)$$

and hence

$$P'(x) = Q(x) + (x+\alpha_1)Q'(x).$$


60 Eu. I-15, pp. 435-497. E 555.
Next note that since $-\alpha_j$, $j = 2, \ldots, n$, are roots of $Q(x)$ we may conclude that

$$P'(-\alpha_j) = (\alpha_1 - \alpha_j)Q'(-\alpha_j). \qquad (14.53)$$

With $Y$ as given by (14.51), consider the general term on the right-hand side of
(14.52):

$$\frac{e^{-\alpha_jx}}{Q'(-\alpha_j)}\int e^{\alpha_jx}Y\,dx. \qquad (14.54)$$

Integration by parts then shows that (14.54) is equal to

$$\frac{e^{-\alpha_jx}}{Q'(-\alpha_j)}\left(\frac{e^{(\alpha_j-\alpha_1)x}}{\alpha_j-\alpha_1}\int e^{\alpha_1x}X\,dx - \frac{1}{\alpha_j-\alpha_1}\int e^{\alpha_jx}X\,dx\right). \qquad (14.55)$$

By using (14.53), we may rewrite (14.55) as

$$-\frac{e^{-\alpha_1x}}{P'(-\alpha_j)}\int e^{\alpha_1x}X\,dx + \frac{e^{-\alpha_jx}}{P'(-\alpha_j)}\int e^{\alpha_jx}X\,dx. \qquad (14.56)$$

Substituting (14.56) for the right-hand side of (14.52), we can arrive at (14.48),
providing that we can verify that

$$-\sum_{j=2}^{n}\frac{1}{P'(-\alpha_j)} = \frac{1}{P'(-\alpha_1)}. \qquad (14.57)$$

We can prove (14.57) by applying the Waring–Lagrange interpolation formula
(9.27), with $T$ a polynomial of degree $\le n - 1$:

$$T(x) = \sum_{j=1}^{n}\frac{P(x)}{P'(-\alpha_j)(x+\alpha_j)}\,T(-\alpha_j).$$

Taking $T(x) = 1$, we obtain

$$1 = \sum_{j=1}^{n}\frac{P(x)}{P'(-\alpha_j)(x+\alpha_j)},$$

in which we equate the coefficient of $x^{n-1}$ on both sides to arrive at

$$0 = \sum_{j=1}^{n}\frac{1}{P'(-\alpha_j)},$$

concluding our proof of (14.57).
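The identity (14.57) is easy to test with exact arithmetic. The sketch below assumes a monic $P(x) = \prod (x + \alpha_i)$, so that $A_n = 1$, and uses arbitrarily chosen distinct nonzero values for the $\alpha_i$; the function and variable names are ours. As a second check it takes $X = 1$, for which each integral in (14.48) is $e^{\alpha_j x}/\alpha_j$ and the formula should reduce to the constant particular solution $1/A_0 = 1/P(0)$.

```python
from fractions import Fraction

def p_prime_at_root(alphas, j):
    """P'(-alpha_j) for monic P(x) = prod (x + alpha_i): product of (alpha_i - alpha_j), i != j."""
    result = Fraction(1)
    for i, a in enumerate(alphas):
        if i != j:
            result *= a - alphas[j]
    return result

# Arbitrary distinct, nonzero sample values (an assumption for illustration).
alphas = [Fraction(1), Fraction(2), Fraction(5), Fraction(-3, 2)]

# Sum over all j of 1/P'(-alpha_j); (14.57) says this sum is zero.
reciprocal_sum = sum(1 / p_prime_at_root(alphas, j) for j in range(len(alphas)))

# With X = 1, formula (14.48) gives y = sum of 1/(alpha_j P'(-alpha_j)).
constant_solution = sum(1 / (a * p_prime_at_root(alphas, j)) for j, a in enumerate(alphas))

# P(0) is the product of the alphas, which equals A_0 for monic P.
p_at_zero = Fraction(1)
for a in alphas:
    p_at_zero *= a

print(reciprocal_sum, constant_solution, 1 / p_at_zero)
```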

In this same paper, Euler also considered the case for which roots, real or complex,
were repeated. He first supposed a real root $\alpha_1$ occurred exactly twice, with $\alpha_1 = \alpha_2$.
He set

$$P(x) = (x+\alpha_1)^2Q(x)$$

and took the derivative twice to find that

$$\frac{1}{2}P''(-\alpha_1) = Q(-\alpha_1) = (\alpha_3 - \alpha_1)(\alpha_4 - \alpha_1)\cdots(\alpha_n - \alpha_1). \qquad (14.58)$$

He then let $\alpha_2 = \alpha_1 + \omega$, where $\omega$ was an infinitesimal. Then

$$P'(-\alpha_1) = \omega Q(-\alpha_1) \quad\text{and}\quad P'(-\alpha_2) = -\omega Q(-\alpha_1),$$

and the first two terms on the right-hand side of (14.48) would take the form

$$\frac{e^{-\alpha_1x}}{\omega Q(-\alpha_1)}\int e^{\alpha_1x}X\,dx - \frac{e^{-(\alpha_1+\omega)x}}{\omega Q(-\alpha_1)}\int e^{(\alpha_1+\omega)x}X\,dx. \qquad (14.59)$$

Euler observed that

$$e^{-(\alpha_1+\omega)x} = e^{-\alpha_1x}(1 - \omega x) \quad\text{since}\quad e^{-\omega x} = (1 - \omega x), \qquad (14.60)$$

where he neglected the second and higher-order terms in $\omega$. Similarly,

$$e^{(\alpha_1+\omega)x} = e^{\alpha_1x}(1 + \omega x). \qquad (14.61)$$

Substituting (14.60) and (14.61) into (14.59), simplifying the expression, and
neglecting the term involving $\omega$, Euler arrived at

$$\frac{e^{-\alpha_1x}}{Q(-\alpha_1)}\left(x\int e^{\alpha_1x}X\,dx - \int e^{\alpha_1x}xX\,dx\right) = \frac{e^{-\alpha_1x}}{Q(-\alpha_1)}\int\left(\int e^{\alpha_1x}X\,dx\right)dx. \qquad (14.62)$$

Using a similar approach, Euler then considered the case in which there were three
or more repeated roots, for the real as well as the complex case.


14.7 Lagrange’s Use of the Adjoint 


In 1761-62, Lagrange illustrated how to use the knowledge of the solutions of a
homogeneous equation of second-order to solve the corresponding nonhomogeneous
equation.⁶¹ He supposed that $y_1$ and $y_2$ were (independent) solutions of the homogeneous
part of the second-order equation

$$Ly + M\frac{dy}{dt} + N\frac{d^2y}{dt^2} = T. \qquad (14.63)$$


61 Lagrange (1867-1892) vol. 1, pp. 471-478.
By multiplying by $z$ and applying integration by parts, the adjoint equation was
obtained:

$$y\left(Mz - \frac{d(Nz)}{dt}\right) + Nz\frac{dy}{dt} + \int\left(Lz - \frac{d(Mz)}{dt} + \frac{d^2(Nz)}{dt^2}\right)y\,dt = \int Tz\,dt. \qquad (14.64)$$

He then set

$$Lz - \frac{d(Mz)}{dt} + \frac{d^2(Nz)}{dt^2} = 0,$$

so that

$$y\left(Mz - \frac{d(Nz)}{dt}\right) + Nz\frac{dy}{dt} = \int Tz\,dt. \qquad (14.65)$$

Multiplying this adjoint equation by a function $y$ and applying the same integration
by parts procedure, Lagrange arrived at

$$y\left(Mz - \frac{d(Nz)}{dt}\right) + Nz\frac{dy}{dt} - \int\left(Ly + M\frac{dy}{dt} + N\frac{d^2y}{dt^2}\right)z\,dt = \text{constant}.$$

When $y = y_1$ or $y = y_2$, the integral vanished, and he got the two relations

$$z\left[\left(M - \frac{dN}{dt}\right)y_1 + N\frac{dy_1}{dt}\right] - Ny_1\frac{dz}{dt} = c_1, \qquad (14.66)$$

$$z\left[\left(M - \frac{dN}{dt}\right)y_2 + N\frac{dy_2}{dt}\right] - Ny_2\frac{dz}{dt} = c_2, \qquad (14.67)$$

where $c_1$ and $c_2$ were constants. Lagrange solved for $z$ to obtain

$$z = \frac{c_1y_2 - c_2y_1}{N\left(y_2\frac{dy_1}{dt} - y_1\frac{dy_2}{dt}\right)}. \qquad (14.68)$$

It may be helpful to note that the denominator here is $N$ times the Wronskian of
$y_1$ and $y_2$. Lagrange then took $c_1 = 0$ and then $c_2 = 0$, denoting the corresponding
values of $z$ by $z_1$ and $z_2$, respectively. When these were substituted in (14.65), he
obtained

$$y\left(Mz_1 - \frac{d(Nz_1)}{dt}\right) + Nz_1\frac{dy}{dt} = \int Tz_1\,dt,$$

$$y\left(Mz_2 - \frac{d(Nz_2)}{dt}\right) + Nz_2\frac{dy}{dt} = \int Tz_2\,dt.$$

He solved for $y$ and thereby arrived at a solution for (14.63):

$$y = \frac{z_2\int Tz_1\,dt - z_1\int Tz_2\,dt}{N\left(z_1\frac{dz_2}{dt} - z_2\frac{dz_1}{dt}\right)}.$$

Lagrange then considered the case where just one solution $y_1$ was known so that he
had only equation (14.66). Now (14.66) was a linear first-order equation in $z$ solvable
by Euler’s integrating factor method, so that

$$z = \frac{y_1e^{\int\frac{M}{N}dt}}{N}\left(c - c_1\int\frac{e^{-\int\frac{M}{N}dt}}{y_1^2}\,dt\right).$$

Again, by taking $c_1 = 0$ and then $c = 0$, he obtained two values, $z_1$ and $z_2$, of $z$.
Thus, Lagrange taught us that when two solutions of the homogeneous equation are
known, the solution of the nonhomogeneous equation may be obtained by solving
two sets of linear equations. But when only one solution is known, one must solve
a first-order equation and a pair of linear equations. Lagrange pointed out that when
$L$, $M$, and $N$ were constants, then the homogeneous equation could easily be solved
by Euler’s method, and hence the nonhomogeneous equation could also be solved in
general for this case. In the case for which $L$, $M$, and $N$ were constants and $k_1$ and
$k_2$ were solutions of $L + Mk + Nk^2 = 0$, $(k_1 \ne k_2)$, Lagrange gave the solution of
(14.63) by the formula

$$y = \frac{e^{k_2t}\int Te^{-k_2t}\,dt - e^{k_1t}\int Te^{-k_1t}\,dt}{N(k_2 - k_1)}.$$


14.8 Jakob Bernoulli and Riccati’s Equation 


In his 1703 letter to Leibniz,⁶² Bernoulli gave the derivation for the solution of
Riccati’s equation, $dy = yy\,dx + xx\,dx$. Part of the problem here was to reduce the
equation to a separable one. Bernoulli used an interesting substitution to accomplish
this, ending up with a second-order instead of a first-order equation. In order to
solve the second-order equation, he had to use infinite series. He began by setting
$y = -dz : z\,dx$ $\left(y = -\frac{1}{z}\frac{dz}{dx}\right)$, so that by the quotient rule for differentials, the
differential equation took the form

$$dx\,dz^2 - z\,dx\,ddz : zz\,dx^2 = dy = yy\,dx + xx\,dx = dz^2 : zz\,dx + xx\,dx;$$

this simplified to

$$-z\,dx\,ddz = xx\,zz\,dx^3,$$

62 Leibniz (1971) vol. 3, part 2, pp. 74-78.
a separable equation expressible as

$$-ddz : z = xx\,dx^2. \qquad (14.69)$$

Now this was the second-order equation $-\frac{1}{z}\frac{d^2z}{dx^2} = x^2$. Bernoulli observed that if
you had an equation $-z^e\,ddz = x^v\,dx^2$ and you sought a solution $z = ax^m$, then by
substituting this in the equation you would find that $m = (v+2) : (e+1)$. However, in
(14.69), $e = -1$ and therefore no solution of the form $ax^m$ would be possible. He then
drew an analogy with the first-order equation $dz : z = x^v\,dx$, for which no algebraic
solution was possible when $v = -1$. So Bernoulli concluded that no algebraic solution
was possible for (14.69) and that he must take recourse in infinite series. He obtained
the series solution as

$$z = 1 - \frac{x^4}{3\cdot 4} + \frac{x^8}{3\cdot 4\cdot 7\cdot 8} - \frac{x^{12}}{3\cdot 4\cdot 7\cdot 8\cdot 11\cdot 12} + \frac{x^{16}}{3\cdot 4\cdot 7\cdot 8\cdot 11\cdot 12\cdot 15\cdot 16} - \cdots.$$

Since $y = -\frac{1}{z}\frac{dz}{dx}$, he could write his solution as a ratio of two infinite series. By
dividing the series for $-\frac{dz}{dx}$ by the series for $z$, he obtained the first few terms of the
series for $y$ given in (14.3).
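The division of the two series can be reproduced with exact rational coefficients. In the sketch below the truncation order `N`, the helper names, and the normalization $z(0) = 1$ are our assumptions; the recurrence comes from the second-order equation $d^2z/dx^2 = -x^2z$.

```python
from fractions import Fraction

N = 20  # work with power series truncated after x^N (an arbitrary choice)

# z = 1 - x^4/(3*4) + x^8/(3*4*7*8) - ..., from k(k - 1) z_k = -z_{k-4}.
z = [Fraction(0)] * (N + 1)
z[0] = Fraction(1)
for k in range(4, N + 1, 4):
    z[k] = -z[k - 4] / (k * (k - 1))

def derivative(s):
    """Termwise derivative, padded with a zero so the length is unchanged."""
    return [k * s[k] for k in range(1, len(s))] + [Fraction(0)]

def divide(num, den):
    """Power-series quotient num/den, assuming den[0] != 0."""
    q = []
    for k in range(len(num)):
        acc = num[k] - sum(q[i] * den[k - i] for i in range(k))
        q.append(acc / den[0])
    return q

# y = -(dz/dx)/z, the solution of dy = yy dx + xx dx that vanishes at x = 0.
y = divide([-c for c in derivative(z)], z)
print(y[3], y[7], y[11])
```

The first nonzero coefficients come out as those of $x^3$, $x^7$, and $x^{11}$, which is the pattern one expects for the series of $y$.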


14.9 Riccati’s Equation 


We have seen that in the 1730s, Euler wrote on Riccati’s equation, and returned to
the topic sometime around 1760, then composing an important paper on first-order
differential equations, published in 1763.⁶³ In that paper, he explained how to obtain
the general solution of the Riccati equation if one particular solution were known. His
method was to use the known solution to reduce the Riccati equation to a first-order
linear differential equation. He supposed $v$ to be a solution of the equation

$$dy + Py\,dx + Qyy\,dx + R\,dx = 0, \qquad (14.70)$$

and observed that $y = v + \frac{1}{z}$ reduced this equation to

$$-\frac{dz}{zz} + \frac{P\,dx}{z} + \frac{2Qv\,dx}{z} + \frac{Q\,dx}{zz} = 0$$

or

$$dz - (P + 2Qv)z\,dx - Q\,dx = 0. \qquad (14.71)$$

He then noted that $S = e^{-\int (P+2Qv)\,dx}$ was an integrating factor. Hence his solution
of (14.71) was $Sz - \int QS\,dx = \text{Constant}$. Later in the paper, Euler considered the
particular case of (14.70), discussed by Riccati and the Bernoullis:

$$dy + yy\,dx = ax^m\,dx. \qquad (14.72)$$


63 Eu. I-22, pp. 334-394. E 269.




Euler could find a special solution of this equation, by use of which he could determine
the general solution by the already described method. He set $a = cc$, $m = -4n$ and

$$y = cx^{-2n} + \frac{1}{z}\frac{dz}{dx} \qquad (14.73)$$

so that (14.72) was converted to the linear second-order equation

$$\frac{ddz}{dx^2} + \frac{2c}{x^{2n}}\frac{dz}{dx} - \frac{2nc}{x^{2n+1}}z = 0. \qquad (14.74)$$

He then solved this equation as a series:

$$z = Ax^n + Bx^{3n-1} + Cx^{5n-2} + Dx^{7n-3} + Ex^{9n-4} + \text{etc.}$$

After substituting this in (14.74), he found

$$B = \frac{-n(n-1)A}{2(2n-1)c}, \quad C = \frac{-(3n-1)(3n-2)B}{4(2n-1)c}, \quad D = \frac{-(5n-2)(5n-3)C}{6(2n-1)c}, \quad \text{etc.}$$

Euler did not write the general case but if we let $A_k$ denote the coefficient of
$x^{(2k+1)n-k}$, starting at $k = 0$, then the recurrence relation would be

$$A_k = -\frac{((2k-1)n - k + 1)((2k-1)n - k)}{2k(2n-1)c}A_{k-1}. \qquad (14.75)$$

Note that if for some $k$, $A_k = 0$, then $A_j = 0$ for $j = k+1, k+2, \ldots$. In this case,
the series reduced to a finite sum and from (14.75) one could determine the general
condition to be $n = \frac{k}{2k-1}$, or $n = \frac{k-1}{2k-1}$. For these values, the solution could be written
in finite form.
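The termination of the series can be observed directly from the recurrence. The sketch below assumes the recurrence has the form $A_k = -\frac{((2k-1)n-k+1)((2k-1)n-k)}{2k(2n-1)c}A_{k-1}$ with $A_0 = 1$; the sample values $n = \frac23$ (a terminating case, $k = 2$ in $n = \frac{k}{2k-1}$) and $n = \frac14$ (a non-terminating case), together with $c = 1$, are our own choices.

```python
from fractions import Fraction

def series_coefficients(n, c, count):
    """A_0 = 1 and A_k for k = 1, ..., count - 1, from the recurrence (14.75)."""
    coeffs = [Fraction(1)]
    for k in range(1, count):
        top = ((2 * k - 1) * n - k + 1) * ((2 * k - 1) * n - k)
        coeffs.append(-top * coeffs[-1] / (2 * k * (2 * n - 1) * c))
    return coeffs

# n = 2/3 corresponds to m = -4n = -8/3; the factor (3n - 2) kills A_2 and all later terms.
terminating = series_coefficients(Fraction(2, 3), Fraction(1), 6)

# n = 1/4 is not of the form k/(2k - 1) or (k - 1)/(2k - 1), so no coefficient vanishes.
generic = series_coefficients(Fraction(1, 4), Fraction(1), 6)

print(terminating)
print(all(a != 0 for a in generic))
```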


14.10 Singular Solutions 


In his 1715 book, Methodus Incrementorum, Brook Taylor presented some techniques
for solving differential equations.⁶⁴ In proposition VIII, he explained that solutions in
finite form might be found if the equation could be suitably transformed. In describing
one method of doing this, he considered the differential equation

$$4x^3 - 4x^2 = (1+z^2)^2\dot{x}^2 \qquad (14.76)$$

and found a singular solution. He set $x = v^\theta y^\gamma$, where $\theta$ and $\gamma$ were parameters to be
chosen appropriately later on, so that

$$\dot{x} = (\theta\dot{v}y + \gamma\dot{y}v)v^{\theta-1}y^{\gamma-1}. \qquad (14.77)$$


64 Taylor (1715) pp. 26-27. For an English translation, see Feigenbaum (1981).
He substituted these values of $x$ and $\dot{x}$ in (14.76) to obtain

$$4v^{3\theta}y^{3\gamma} - 4v^{2\theta}y^{2\gamma} = (1+z^2)^2(\theta\dot{v}y + \gamma\dot{y}v)^2v^{2\theta-2}y^{2\gamma-2}. \qquad (14.78)$$

Taylor then chose $v = 1 + z^2$ and assumed that $z$ was flowing uniformly, that is
$\dot{z} = 1$ so that $\dot{v} = 2z$. Substituting these values in (14.78), he arrived at

$$4v^{\theta}y^{\gamma+2} - 4y^2 = (2\theta zy + \gamma\dot{y}v)^2. \qquad (14.79)$$

Taylor then took $\gamma = -2$ to eliminate $y$ in the first term on the left-hand side to obtain

$$v^\theta - y^2 = (\theta zy - \dot{y}v)^2$$

or

$$v^\theta = (\theta^2z^2 + 1)y^2 - 2\theta zvy\dot{y} + v^2\dot{y}^2. \qquad (14.80)$$

At this point, he set $\theta = 1$ so that $\theta^2z^2 + 1 = z^2 + 1 = v$; then dividing by $v$,
equation (14.80) reduced to $1 = y^2 - 2zy\dot{y} + v\dot{y}^2$. Taking fluxions, he found

$$0 = 2y\dot{y} - 2y\dot{y} - 2z\dot{y}^2 - 2zy\ddot{y} + \dot{v}\dot{y}^2 + 2v\dot{y}\ddot{y}.$$

Since $\dot{v} = 2z$, he had

$$-2zy\ddot{y} + 2v\dot{y}\ddot{y} = 0. \qquad (14.81)$$

Thus, (14.81) implied that either $\ddot{y} = 0$ or $-2zy + 2v\dot{y} = 0$. Then the latter gave him

$$-\dot{v}y + 2v\dot{y} = 0 \quad\text{or}\quad y^2 = v. \qquad (14.82)$$

Now since he had taken $\theta = 1$ and $\gamma = -2$, he got a solution,

$$x = v^\theta y^\gamma = vy^{-2} = 1,$$

where the last relation followed from (14.82). At this point, he remarked that $x = 1$
was “a certain singular solution of the problem.” For the equation $\ddot{y} = 0$, he picked
the initial values in such a way as to obtain as solution

$$y = a + \sqrt{1 - a^2}\,z,$$

and hence

$$x = \frac{1 + z^2}{\left(a + \sqrt{1-a^2}\,z\right)^2}. \qquad (14.83)$$

Observe that the solution $x = 1$ is not obtained from the general solution (14.83) for any
value of $a$.
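Both kinds of solution can be substituted back into Taylor's equation exactly. The sketch assumes the equation is $4x^3 - 4x^2 = (1+z^2)^2\dot{x}^2$ with $z$ as the independent variable, and that the general solution is $x = vy^{-2}$ with $v = 1 + z^2$ and $y = a + \sqrt{1-a^2}\,z$; the rational pair $(a, b) = (\frac35, \frac45)$ standing for $(a, \sqrt{1-a^2})$ and the sample points are our choices.

```python
from fractions import Fraction

def general_solution(z, a, b):
    """x = (1 + z^2)/(a + b z)^2 and its fluxion, with b standing for sqrt(1 - a^2)."""
    v, y = 1 + z * z, a + b * z
    x = v / y**2
    x_dot = (2 * z * y - 2 * v * b) / y**3   # quotient rule with v' = 2z, y' = b
    return x, x_dot

def residual(z, x, x_dot):
    """Left side minus right side of (14.76)."""
    return 4 * x**3 - 4 * x**2 - (1 + z * z) ** 2 * x_dot**2

a, b = Fraction(3, 5), Fraction(4, 5)         # a rational pair with a^2 + b^2 = 1
zs = [Fraction(k, 2) for k in range(-1, 6)]   # sample points avoiding a + b z = 0

general_ok = all(residual(z, *general_solution(z, a, b)) == 0 for z in zs)
singular_ok = all(residual(z, Fraction(1), Fraction(0)) == 0 for z in zs)
print(general_ok, singular_ok)
```

The singular solution is checked with $x = 1$ and $\dot{x} = 0$; it cannot arise from the general form, since $1 + z^2 = (a + bz)^2$ for all $z$ would require $a^2 = b^2 = 1$.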

Euler discussed singular solutions in a 1754 paper on paradoxes in integral
calculus.⁶⁵ He there noted the paradoxical fact that there were differential
equations easier to solve by differentiating than by integrating. He wrote that he had
encountered such equations in his work on mechanics but that his purpose was to
explain that there were easily stated geometric problems from which similar types
of equations could arise. Euler started by presenting the problem of finding a curve
such that the length of the perpendicular from a given point to any tangent to the
curve was a constant. By using similar triangles, he found that the differential equation
would be

$$y\,dx - x\,dy = a\sqrt{dx^2 + dy^2}, \qquad (14.84)$$

where $a$ denoted a constant. After squaring and solving for $dy$ he obtained

$$(a^2 - x^2)\,dy + xy\,dx = a\,dx\sqrt{x^2 + y^2 - a^2}. \qquad (14.85)$$

He substituted $y = u\sqrt{a^2 - x^2}$ to transform the equation into the separable
equation

$$\frac{du}{\sqrt{u^2 - 1}} = \frac{a\,dx}{a^2 - x^2}. \qquad (14.86)$$

After integration he obtained

$$\ln\left(u + \sqrt{u^2 - 1}\right) = \frac{1}{2}\ln\frac{n^2(a+x)}{a-x},$$

where $n$ was a constant. He simplified this to

$$u = \frac{n}{2}\sqrt{\frac{a+x}{a-x}} + \frac{1}{2n}\sqrt{\frac{a-x}{a+x}}$$

or

$$y = u\sqrt{a^2 - x^2} = \frac{n}{2}(a+x) + \frac{1}{2n}(a-x). \qquad (14.87)$$

Writing (14.86) in the form

$$du = \frac{a\,dx\sqrt{u^2 - 1}}{a^2 - x^2}$$

shows that $u = 1$ is also a solution because both sides vanish. When Euler used this
solution in the first equation in (14.87), he got $y = \sqrt{a^2 - x^2}$, or
65 Eu. I-22, pp. 214-236. E 236.

$$x^2 + y^2 = a^2. \qquad (14.88)$$

Thus, the solution of (14.85) turned out to be a family of straight lines (14.87) as
well as the circle (14.88).

In the same paper, Euler next set out to show that he could derive these solutions 
by differentiation. Now note that this paper was written before his book on differential 
calculus in which he explained how higher differentials could be completely replaced 
by higher differential coefficients. So he explained that he would assume that dy = 
pdx, with p a differential coefficient, to remove difficulties associated with further 
differentiation. His equation (14.84) then became 


y= px+a/1+ p*. (14.89) 
Euler then differentiated (instead of integrated) this equation to get 


ap dp 


J1+ p? 


dy = pdx +xdp+ 


or 


d 
(24 aptee ees (14.90) 


1+ p? 


or 


ap 
x= -———_.. 
Jive 
Hence by (14.89), y = Tie By eliminating p, he obtained the solution 
x? + y* = a? and noted that he could also find the family of straight lines by this 
method. For that purpose, he observed that (14.90) also had the solution dp = 0. This 
implied p = constant = n, and so by (14.89) he obtained y = nx + aVv1+n7?, the 
required system of straight lines. 
Euler then remarked that equation (14.84) could be modified in such a way that 
the new equation could be solved more easily by the second method than the first. He 
considered the equation 


ydx — xdy =a(dx3 + dy3)3, (14.91) 


ory = pxta+ py3. As solutions, he found a sixth-order curve 


y® 2x73 1° 2a7y? + 2a°x? +a° = 0 (14.92) 


and the family of straight lines 


y=nx+ad+ n>)3, (14.93) 
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Euler gave three other geometric examples leading to differential equations with 
singular solutions. One of them yielded 


ydx — (x — b)dy = (@ — x?)dx? + a2dy?, (14.94) 


an equation difficult to integrate. He solved it by differentiation to obtain the solutions: 
the ellipse 

$$\frac{(x - b)^2}{a^2} + \frac{y^2}{a^2 - b^2} = 1,$$

and the family of straight lines

$$y = -n(b - x) + \sqrt{a^2(1 + n^2) - b^2}.$$


Later in the paper, Euler remarked that he found it strange that integration, which 
introduced arbitrary constants, did not produce the general solution, while differentia- 
tion did. 


14.11 Mukhopadhyay on Monge’s Equation 


In his letter to Huygens of September 4/14, 1694, Leibniz noted that the differential 
equation for the circle $x^2 + y^2 = a^2$ could be expressed as $dy = -\frac{x\,dx}{y}$. Now start with
the equation for a general circle,

$$(x - a)^2 + (y - b)^2 = c^2,$$

or

$$x^2 + y^2 = 2ax + 2by + c^2 - a^2 - b^2. \tag{14.95}$$


To obtain the differential equation, we temporarily let $p, q, r$ denote the first three
derivatives of $y$ with respect to $x$; differentiation of the last equation gives
$x + yp = a + bp$ and hence $1 + p^2 + yq = bq$. Therefore, $1 + p^2 = (b - y)q$ and
$x - a = (b - y)p$; then, after a short calculation,

$$c = \frac{(1 + p^2)^{3/2}}{q}. \tag{14.96}$$


In his book on differential equations, Boole remarked⁶⁷ that since the right hand
side of (14.96) was the expression for the radius of curvature, this equation was the 
differential equation for a circle of radius c. To obtain the equation for a circle of 
arbitrary radius, take the derivative of (14.96) to get 


$$3pq^2 - r(1 + p^2) = 0. \tag{14.97}$$

66 Leibniz (1971) vol. 2, pp. 195-196.
67 Boole (1877) p. 20.


George Salmon (1819-1904) offered an interesting geometric interpretation of this
equation⁶⁸ in his book Higher Plane Curves, first published in 1852, also in later
editions. Salmon defined the “aberrancy of a curve” where the curve was $y = f(x)$. Let
$P$ be a point on the curve and $V$ the midpoint of a chord $AB$ drawn parallel to the
tangent at $P$. Let $\delta$ denote the limit of the angles made by the normal at $P$ with the
line $PV$ as $A$ and $B$ tend to $P$. Salmon called $\delta$ the aberrancy because $\delta = 0$ for a
circle. He noted that

$$\tan\delta = p - \frac{(1 + p^2)r}{3q^2}. \tag{14.98}$$

Thus, the geometric meaning of (14.97) was that the aberrancy vanished at any 
point of any circle. 
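Boole also stated the differential equation of a general conic
$ax^2 + 2hxy + by^2 + 2gx + 2fy + c = 0$:

$$9\left(\frac{d^2y}{dx^2}\right)^2\frac{d^5y}{dx^5} - 45\,\frac{d^2y}{dx^2}\,\frac{d^3y}{dx^3}\,\frac{d^4y}{dx^4} + 40\left(\frac{d^3y}{dx^3}\right)^3 = 0. \tag{14.99}$$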


This differential equation was published in 1810 by the French geometer Gaspard 
Monge (1746-1818). Boole remarked on this equation, “But here our powers of 
geometrical interpretation fail, and results such as this can scarcely be otherwise useful 
than as a registry of integrable forms.” 

The Indian mathematician Asutosh Mukhopadhyay (also Mookerjee) (1864-1924)
published an 1889 paper⁷⁰ in the Journal of the Asiatic Society of Bengal, showing
an interesting geometric interpretation of Monge’s equation (14.99). Concerning this
result, much appreciated by some British mathematicians, Edwards wrote in the 1892
second edition of his treatise on differential calculus:⁷¹


A remarkable interpretation which calls for notice has, however, been recently offered by 
Mr. A. Mukhopadhyay, who has observed that the expression for the radius of curvature of the 
locus of the centre of the conic of five pointic [sic] contact with any curve (called the centre of 
aberrancy) contains as a factor the left-hand member of Monge’s equation, and this differential 
equation therefore expresses that the “radius of curvature of the ‘curve of aberrancy’ vanishes for 
any point of any curve.” 


Mukhopadhyay received an M.A. in mathematics from Calcutta University in 
1886. He studied much on his own, as is shown by entries in his diary:⁷² “Rose
at 6.15 a.m. Read Statesman [Newspaper], and Boole’s Diff. Equations in the morning. 
Read Fourier’s Heat at noon.” And “At noon read from Messenger of Math. Vol. 2, 


68 Salmon (1879) p. 369. 
69 Boole (1877) p. 20. 

70 Mukhopadhyay (1889a). 
71 Edwards (1954a) p. 436. 
72 Mukhopadhyay (1998). 
Prof. Cayley’s Memoir on Singular Solutions—to my mind, the simplest but the
most philosophical account of the subject yet given; read from Forsyth on the same 
subject.” Mukhopadhyay published papers on topics in differential geometry, elliptic 
functions and hydrodynamics. The following abstract of his 1889 paper, On a Curve 
of Aberrancy, may give a sense of his mathematical work:⁷³


The object of this note is to prove that the aberrancy curve (which is the locus of the centre of 
the conic of closest contact) of a plane cubic of Newton’s fourth class is another plane cubic of 
the same class, the invariants of which are proportional to the invariants of the original cubic; it 
is also proved that the two cubics have only one common point of intersection, which is the point 
of inflection for both. 


Mukhopadhyay thus gave evidence of a fine mathematical mind but his dream 
of spending his life in mathematical research could not be realized because there 
was no support for such endeavors in nineteenth-century Indian universities. As 
one biographer wrote, “Sir Asutosh’s contributions to mathematical knowledge were 
due to his unaided efforts while he was only a college student.” Interestingly, after 
serving as a judge, in 1906 Mukhopadhyay became Vice-Chancellor of Calcutta 
University and his first order of business was to have the University “combine the func- 
tions of teaching and original investigation.” He appointed Syamadas Mukhopadhyay 
(1866-1937) professor of mathematics and encouraged him to pursue research. 
S. Mukhopadhyay subsequently produced several interesting results, including the 
well-known four vertex theorem,⁷⁴ published in the Bulletin of the Calcutta
Mathematical Society, founded by Asutosh. Syamadas stated the theorem: “The minimum
number of cyclic points on an oval is four.” Asutosh created a physics department
at the university with the appointment of the experimental physicist C. V. Raman, 
whom he persuaded to leave his post as an officer in the Indian Accounts Department. 
Raman went on to win a Nobel Prize in physics. In applied mathematics, he appointed 
S. N. Bose, known for his statistical derivation of Planck’s law leading to the 
Bose-Einstein statistics, and M. N. Saha, who discovered the Saha ionization law in
astrophysics. These discoveries by Saha and Bose were made several years after their 
appointments, validating Asutosh’s confidence in them. 


14.12 Exercises 


(1) (a) Solve the equation $a^4\frac{d^4y}{dx^4} - y = 0$. Euler gave

$$y = \alpha e^{x/a} + \beta e^{-x/a} + \gamma\sin\left(\frac{x}{a} + A\right).$$

(b) Solve the equation $a^4\frac{d^4y}{dx^4} + y = 0$.


73 Mukhopadhyay (1889b). 
74 Mukhopadhyay (1909). 
Euler gave

$$y = \alpha e^{x/(\sqrt{2}a)}\sin\left(\frac{x}{\sqrt{2}a} + A\right) + \beta e^{-x/(\sqrt{2}a)}\sin\left(\frac{x}{\sqrt{2}a} + B\right).$$


See sections 33 and 34 of Euler’s paper E 62. 
(2) (a) Solve the equation $X = y - a^2\frac{d^2y}{dx^2}$. Euler gave the solution

$$y = \frac{1}{2a}e^{-x/a}\int e^{x/a}X\,dx - \frac{1}{2a}e^{x/a}\int e^{-x/a}X\,dx.$$


(b) Solve the equation X = y + ae . Euler gave 


x x D 23 a. 
ea [eixas — — e% cos v3x + Z fe# cos V3x 
3a 2a 3 


2. Ba. V3x.% [8 sin 
ee, a sin = a Sin 
age jG 3 2a 


These are examples given in section 23 of Euler’s paper E 188. 


y= 


g|- 


X dx. 


(3) Show that the polynomial

$$y_k = L_k(x) = \sum_{j=0}^{k} (-1)^j\,\frac{k(k-1)\cdots(k-j+1)}{j!\,j!}\,x^j,$$

the kth Laguerre polynomial, is a solution of the recurrence relation

$$(k+1)y_{k+1} - (2k + 1 - x)y_k + ky_{k-1} = 0, \quad k = 0, 1, 2, \ldots.$$


D. Bernoulli and Euler encountered this equation in their works on the discrete 
analog of the problem of the small oscillations of a hanging chain. They 
discussed the discrete and the continuous forms of the problem while they
were colleagues at the Petersburg Academy in the late 1720s and early 1730s. 
Bernoulli submitted his results to the academy before his departure in 1733 and 
a year later presented his proofs. Upon seeing Bernoulli’s work, Euler, who had 
obtained similar results, submitted his work. They took $x = a/\alpha$ where $a$ was
the distance between the weights and $\alpha$ was related to the angular frequency $\omega$
by $\omega^2 = g/\alpha$; $g$ was the acceleration due to gravity. The $y_k$ was the simultaneous
displacement of the $k$th weight. The chain was assumed to hang from the
$n$th weight, so $y_n = 0$. This gave $L_n(a/\alpha) = 0$ as the equation determining
the frequencies. Euler discovered the polynomial solution of the difference 
equation; in that sense, he and D. Bernoulli were the first mathematicians to use 
Laguerre polynomials. Bernoulli found the smallest roots of these polynomials 
for some values of k, using his method of sequences; see Chapter 13 for Daniel 


Bernoulli’s work on difference equations. 
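
Before attempting a proof, the recurrence in Exercise 3 can be tested in exact rational arithmetic. The following sketch is an illustration only; the function names and the sample point x = 7/3 are chosen here and are not taken from the sources. It evaluates the finite sum defining the kth Laguerre polynomial and then the left-hand side of the recurrence, which should vanish identically.

```python
from fractions import Fraction
from math import factorial

def laguerre(k, x):
    """k-th Laguerre polynomial, evaluated from the finite sum in the exercise."""
    total = Fraction(0)
    falling = 1  # k(k-1)...(k-j+1); the empty product when j = 0
    for j in range(k + 1):
        total += Fraction((-1) ** j * falling, factorial(j) ** 2) * x ** j
        falling *= (k - j)
    return total

def recurrence_residual(k, x):
    """Left-hand side (k+1) y_{k+1} - (2k+1-x) y_k + k y_{k-1} of the recurrence."""
    previous = laguerre(k - 1, x) if k >= 1 else Fraction(0)
    return ((k + 1) * laguerre(k + 1, x)
            - (2 * k + 1 - x) * laguerre(k, x)
            + k * previous)

x = Fraction(7, 3)  # arbitrary rational test point
print([recurrence_residual(k, x) for k in range(8)])
```

Using Fraction rather than floating point means a zero residual is an exact statement at the chosen point, not a rounding coincidence; it is of course a check, not a proof.
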
(4) The differential equation satisfied by the displacement $y$ at a distance $x$ from
the point of suspension of a heavy chain was determined by D. Bernoulli to be

$$x\frac{d^2y}{dx^2} + \frac{dy}{dx} + \frac{y}{\alpha} = 0.$$

Show that

$$y = A J_0\left(2\sqrt{x/\alpha}\right) = A\sum_{j=0}^{\infty}\frac{(-1)^j}{(j!)^2}\left(\frac{x}{\alpha}\right)^j$$

is a solution of this equation. Note that Bernoulli did not use the $J_0$ notation but
gave the corresponding series. The value of $\alpha$ is determined from the equation
$J_0\left(2\sqrt{l/\alpha}\right) = 0$, where $l$ is the length of the chain. Bernoulli stated that this
equation had an infinite number
of roots and gave the first value $\alpha/l = 0.691$, a good approximation. About
50 years later, Euler gave the first three roots. Bernoulli and Euler may have
conjectured the existence of infinitely many roots because the solution of the
difference equation in the previous exercise, with a slight change of variables,
approximates the solution of the differential equation given in this exercise.
We remark that this is not surprising, as the first is a discrete analog of the
second. In fact, $L_k(x/k) \to J_0(2\sqrt{x})$ as $k \to \infty$. And since $L_k(x) = 0$ has $k$ zeros,
$J_0$ must have infinitely many roots; in fact, the $r$th root of $L_k(x/k)$ must tend to the
$r$th root of $J_0(2\sqrt{x})$. This can be proved rigorously by the theory of analytic functions of a
complex variable, though Bernoulli and Euler obviously had no knowledge of
this theory.
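
The limiting relation between the Laguerre polynomials and the series for J_0 can be observed numerically. The sketch below is a rough illustration under its own assumptions: both functions are summed directly from their power series, and the names, the number of terms, and the sample point x = 1.5 are chosen here for convenience.

```python
import math

def laguerre(k, x):
    """L_k(x) summed term by term; term_j = (-1)^j k(k-1)...(k-j+1) x^j / (j!)^2."""
    total, term = 0.0, 1.0
    for j in range(k + 1):
        total += term
        term *= -(k - j) * x / ((j + 1) ** 2)
    return total

def j0_of_2sqrt(x, terms=60):
    """J_0(2*sqrt(x)) from the series sum over j of (-1)^j x^j / (j!)^2."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= -x / ((j + 1) ** 2)
    return total

x = 1.5
for k in (5, 50, 500, 5000):
    print(k, laguerre(k, x / k), j0_of_2sqrt(x))

# The series changes sign near x = 1.4458, and 1/1.4458 is about 0.6917,
# in line with the first value 0.691 quoted in the exercise.
print(j0_of_2sqrt(1.4458))
```
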

(5) In the works mentioned in Exercises 1 and 2, Euler also treated the equation

$$\frac{x}{n+1}\frac{d^2y}{dx^2} + \frac{dy}{dx} + \frac{y}{a} = 0.$$

Show, with Euler, that the equation has the series solution

$$y = Aq^{-n/2}I_n(2\sqrt{q}), \quad \text{where } q = -\frac{(n+1)x}{a}$$

and

$$I_n(x) = \sum_{k=0}^{\infty}\frac{(x/2)^{2k+n}}{k!\,\Gamma(k+n+1)}.$$

This appears to be the first occurrence of the Bessel function of arbitrary real
index $n$. Euler also proved that for $n = -\frac{1}{2}$ the solution was $y = A\cos\sqrt{2x/a}$.
Prove this and show that this is equivalent to the result $J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}}\cos x$.

(6) Euler also obtained the solution of the differential equation in the previous
exercise as the definite integral 
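$$\frac{y}{A} = \frac{\int_0^1 (1 - t^2)^{(2n-1)/2}\cosh\left(2t\sqrt{q}\right)dt}{\int_0^1 (1 - t^2)^{(2n-1)/2}\,dt},$$

with $q$ as in the previous exercise.
Prove Euler’s result. It is equivalent to the Poisson integral representation

$$J_n(x) = \frac{2}{\sqrt{\pi}\,\Gamma\left(n + \frac{1}{2}\right)}\left(\frac{x}{2}\right)^n\int_0^{\pi/2}\cos(x\sin\phi)\cos^{2n}\phi\,d\phi.$$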

Prove this. According to Truesdell, this may be the earliest example of 
solution of a second-order differential equation by a definite integral. For this 
and the previous three exercises, see Truesdell (1960) pp. 154-165 and Cannon 
and Dostrovsky (1981) pp. 53-64. The references to the original papers of 
Euler and Bernoulli may be found there. 


(7) Solve the equation

$$(y\,dx - x\,dy)(y\,dx - x\,dy + 2b\,dy) = c^2(dx^2 + dy^2)$$


by Euler’s method for finding singular solutions. 

(8) Let $f(x, y)$ be bounded and continuous on a domain $G$. Show that then at least
one integral curve of the differential equation $\frac{dy}{dx} = f(x, y)$ passes through
each interior point $(x_0, y_0)$ of $G$. This result is due to the Italian mathematician
Giuseppe Peano (1858-1932) who graduated from the University of Turin
(Torino), where he heard lectures by Angelo Genocchi and Faà di Bruno. Peano
developed some aspects of mathematical logic in order to bring a higher degree 
of clarity to proofs of theorems in analysis. This led him to produce several 
counterexamples to intuitive notions in mathematics; his most famous example 
is that of a space-filling curve, dating from 1890. Bertrand Russell wrote that 
Peano’s ideas on logic had a profound impact on him. See Peano (1973) 
pp. 51-57 for Peano’s 1885 formulation and not completely rigorous proof 
of the theorem stated in this exercise. A more stringent proof may be found 
in Petrovski (1966) pp. 29-33. In 1890, Peano generalized this theorem to 
systems of differential equations. That paper also contained the first explicit 
formulation of the axiom of choice; interestingly, Peano rejected it as a possible 
component of the logic of mathematics. He wrote, “But as one cannot apply 
infinitely many times an arbitrary rule by which one assigns to a class A an 
individual of this class, a determinate rule is stated here.” See Moore (1982) 
p. 4. Nevertheless, a logical equivalent of the axiom of choice, in the form of 
Zorn’s lemma, has turned out to be of fundamental importance in algebra. For 
the origins of Zorn’s lemma, see Paul Campbell (1978). 


Eneström (1897) has a good history of differential equations with constant
coefficients. For Leibniz’s early work on series, see Scriba (1964). An English translation
of Taylor’s Methodus Incrementorum was a part of Feigenbaum’s (1981) Yale doctoral
dissertation. See Edwards (1954a) p. 436, for his remarks on Mukhopadhyay’s
equation. For some discussion of Mukhopadhyay’s role in the development of
mathematics in India, see Narasimhan (1991); for biographical information, see
Sen Gupta (2000). Katz (1998) and (1987) contain a discussion of how, in May 1739,
Euler may have solved $a^3\,d^3y - y\,dx^3 = 0$. A lively history of the Riccati equation is
available in Bottazzini’s article, “The Mathematical Writings from Daniel Bernoulli’s 
Youth,” contained in D. Bernoulli (1982-1996) vol. I, pp. 142-166. The reader may 
also wish to see Burn (2001) for the development of the concept of the logarithm in 
the second half of the seventeenth century, starting with the 1649 work of Alphonse 
de Sarasa. This topic is also discussed in Hofmann’s article on differential equations 
in the seventeenth century; see Hofmann (1990) vol. 2, pp. 277-316. 
Series and Products for Elementary Functions


15.1 Preliminary Remarks 


Euler was the first mathematician to give a systematic and coherent account of the 
elementary functions, although earlier mathematicians had certainly paved the way. 
These functions comprise the circular or trigonometric, the logarithmic, and
the exponential functions. Euler’s approach was a departure from the prevalent, more 
geometric, point of view. On the geometric perspective, the elementary functions were 
defined as areas under curves, lengths of chords, or other geometric conceptions. 
Euler’s 1748 Introductio in Analysin Infinitorum defined the elementary functions 
arithmetically and algebraically, as functions. 

At that time, infinite series were regarded as a part of algebra, though they had been 
obtained through the use of calculus. The general binomial theorem was considered an 
algebraic theorem. So in his Introductio, Euler used the binomial theorem to produce
new derivations of the series for the elementary functions. Interestingly, in this book, 
where he avoided using calculus, Euler gave no proof of the binomial theorem itself; 
perhaps he had not yet found any arguments without the use of calculus. In a paper 
written in the 1730s, Euler derived the binomial theorem from the Taylor series,¹ as
Stirling had done in 1719.² It was only much later, in the 1770s, that Euler found an
argument for the binomial theorem depending simply on the multiplication of series. 
We discuss this argument in Chapter 4. 

We recall from Chapter 8 that before Euler, between 1664 and 1666, Newton 
found the series for all the elementary functions, using a combination of geometric 
arguments, integration, and reversion of series. In the course of his work, he 
also discovered the general binomial theorem. Later on, Gregory, Leibniz, Johann 
Bernoulli, and others used methods of calculus to obtain infinite series for elementary 
functions. Even before Newton, unknown to European mathematicians, the Kerala 
school had derived infinite series for some trigonometric functions, also using a form 
of integration. 


1 Eu. I-14 pp. 108-123. E 47 § 6.
2 Stirling (1719). 
Although the series for $e^x$ and $e$ were already known, one of Euler’s major
innovations was to explicitly define the exponential function. To understand this peculiar
fact, recall that Newton and N. Mercator discovered the series for $y = \ln(1 + x)$ in
the mid-1660s.³ Soon afterward, Newton applied reversion of series to obtain $x$ as a
series in $y$.⁴ Also note that for the eighteenth-century mathematician who took the
geometric point of view, the basic object of study was not the function, but the curve.
From this perspective, there was hardly any need to distinguish between the function
and its inverse, since both curves would take the same form, although with differing
orientations.

In a 1714 paper published in the Philosophical Transactions,⁵ Roger Cotes, in
the spirit of Halley’s earlier work of 1695,⁶ took the step of setting up an analytic
definition of the logarithm. Cotes used this definition to derive the logarithmic series 
and then, by inversion, the series for the exponential. He proceeded to use the series for 
e to compute its value to thirteen decimal places. Incidentally, he also gave continued 
fractions for e and 2 to obtain rational approximations of e. However, Cotes focused 
on the logarithm, rather than its inverse. To understand how the lack of a clear 
conception of the exponential handicapped mathematics, consider that in the early 
1730s, as discussed in Chapter 14, Daniel Bernoulli was unable to fully solve the
differential equation $K^4\frac{d^4y}{dx^4} = y$. He observed that the logarithm, meaning the inverse,
or exponential, satisfied this equation as well as the equation $K^2\frac{d^2y}{dx^2} = y$, but that no
such logarithm was sufficiently general. Euler was also stumped by this problem, until 
he gave an explicit definition of the exponential and developed its properties in the 
mid-1730s. 

To derive the series for elementary functions, Euler made considerable use of 
infinitely large and infinitely small numbers. This method can be made rigorous by an 
appropriate use of limits, as accomplished by Cauchy in the 1820s.⁷ Following Euler’s
style, Cauchy divided analysis into two parts, algebraic analysis and calculus. The 
former dealt with infinite series and products without using calculus, yet employed 
the ideas of limits and convergence. It is interesting to note here that in his Fonctions 
analytiques of 1797, Lagrange had attempted to make differential calculus a part of 
algebraic analysis by defining the derivative of f(x) as the coefficient of h in the 
series expansion of f(x + h). Gauss, Cauchy, and their followers rejected this idea 
as invalid. Besides providing greater rigor, Cauchy’s lectures presented original and 
insightful derivations of some of Euler’s results. 

In addition to defining elementary functions, Euler also showed that functions could 
be represented by infinite products and partial fractions. The latter could be obtained 
from products by applying logarithmic differentiation, a process he worked out in 
his correspondence with Niklaus I Bernoulli.⁸ In his Introductio, Euler presented


3 Mercator (1668). 

4 Newton (1967-1981) vol. 1, pp. 112-115. 
5 Cotes (1714). 

6 Halley (1695).

7 Cauchy (1989). 

8 Eu. IV A-2 pp. 483-550.
fascinating ways of avoiding the methods of calculus involving differentiation and
integration. He also gave an exposition on the connection, discovered earlier by Cotes,
between the trigonometric and the exponential functions. As discussed in Chapter 12,
Cotes had found the relation $\log(\cos x + i\sin x) = ix$, although this equation was
more useful when Euler wrote it as $\cos x + i\sin x = e^{ix}$. Of course, Cotes was unable
to take this last step because he did not explicitly define the exponential $e^x$. Euler, on
the other hand, made use of this relationship to derive important results such as the 
infinite products for the trigonometric functions. 

At the very beginning of his career, Euler discovered the simple and useful 
dilogarithm function. The dilogarithm is defined by 


$$\mathrm{Li}_2(x) = -\int_0^x \frac{\ln(1 - t)}{t}\,dt = \frac{x}{1^2} + \frac{x^2}{2^2} + \frac{x^3}{3^2} + \cdots, \tag{15.1}$$

where the series converges for $|x| < 1$. Euler initially investigated this function
in 1729-30⁹ to evaluate the series $\sum_{n=1}^{\infty}\frac{1}{n^2}$. He succeeded at that time only in
determining its approximate value, but in the 1730s he found the exact value by the
factorization of $\sin x$.

In the 1740s, the English mathematician and surveyor John Landen began
publishing his mathematical problems in the Ladies Diary. In the late 1750s, he discovered
that the dilogarithm could be used to exactly evaluate $\sum_{n=1}^{\infty}\frac{1}{n^2}$,¹⁰ provided that
logarithms of negative numbers were employed. Euler had already developed his
theory of logarithms of complex numbers at that time but his work had not appeared
in print. So Landen’s determination of $\ln(-1) = \pm\sqrt{-1}\,\pi$ in 1758 was an important
and independent discovery. And he went further, by repeated integration, to define the
more general polylogarithm,

$$\mathrm{Li}_k(x) = \frac{x}{1^k} + \frac{x^2}{2^k} + \frac{x^3}{3^k} + \cdots, \tag{15.2}$$

for $k = 1, 2, 3, \ldots$. He could then evaluate the series $\sum_{n=1}^{\infty}\frac{1}{n^{2k}}$ for $k = 1, 2, 3, \ldots$. We
will discuss Euler’s and Landen’s evaluations of $\sum\frac{1}{n^{2k}}$ in Chapter 16.

The polylogarithm was further studied by the Scottish mathematician William 
Spence (1777-1815) who published his book, Essay on Logarithmic Transcendents, in 
1809.¹¹ He derived several interesting results on dilogarithms, including the theorem
for which he is known today. As a student in the 1820s, Abel rediscovered this formula, 
having been inspired to study the dilogarithm by reading Legendre’s three volumes 
on the integral calculus. This work discussed numerous results of Euler. Spence was 
apparently self-taught but, unlike many other British mathematicians of his time, he 
read Bernoulli, Euler, Lagrange, and other continental mathematicians. In the preface 
to his work, Spence commented¹² on the disadvantage of British insularity:


9 Eu. 1-14 pp. 25-41, especially pp. 38-41. E 20 § 22-23. 
10 Landen (1760). 
11 Spence (1809). 
12 Spence (1809) p. xi. 
Our pupils are taught the science by means of its applications; and when their minds should be
occupied with the contemplation of general methods and operations, they are usually employed 
on particular processes and results, in which no traces of the operations remain. On the Continent, 
Analysis is studied as an independent science. Its general principles are first inculcated; and then 
the pupil is led to the applications; and the effects have been, that while we have remained nearly 
stationary during the greater part of the last century, the most valuable improvements have been 
added to the science in almost every other part of Europe. The truth of this needs no illustration. 
Let any person who has studied Mathematics only in British authors look into works of the higher 
analysts of the Continent, and he will soon perceive that he has still much to learn. 


Interestingly, other British mathematicians were independently arriving at this 
conclusion. In 1813, a few students at Cambridge University formed the Analytical 
Society in order to promote broadening mathematical studies to include the works 
of non-British mathematicians. Among the members of this new Society was John 
Herschel, who collected, published, and annotated the works of Spence,¹³ an example
of the progress facilitated by broader mathematical horizons. In one of his extensive 
notes on Spence’s work, Herschel presented, without fanfare, his own discovery of 
the Schwarzian derivative. Interestingly, in 1781 Lagrange also found this derivative, 
but in the context of cartography.¹⁴ We also note that Kummer, before he became a
committed number theorist, wrote a very long 1840 paper on the dilogarithm;¹⁵ this
paper contained a wealth of results, including the rediscovery of Spence’s formula. 


15.2 Euler: Series for Elementary Functions 


Euler defined the exponential functions in about 1748¹⁶ by explaining the meaning of
$a^z$, first for $z$ an integer and then for $z$ a rational number. He remarked that for
irrational $z$ the concept was more difficult to understand but that $a^{\sqrt{7}}$, for example, had
a value between $a^2$ and $a^3$ when $a > 1$. He noted that the study of $a^z$ for $0 < a < 1$
could be reduced to the case where $a > 1$. He then defined the logarithm: If $a^z = y$,
then $z$ is called the logarithm of $y$ to the base $a$. Euler did not have a notation for the
base of a logarithm and always expressed the base in words. According to Cajori,¹⁷ in
1821, A. L. Crelle, founder of the famous journal and a friend of Abel, introduced the
notation for the base of the logarithm, writing the base $a$ on the upper left-hand side
of the log. However, we employ the modern notation: $\log_a$.

To obtain the series for $a^z$, Euler observed that since $a^0 = 1$, he could write
$a^{\omega} = 1 + k\omega$, where $\omega$ was an infinitely small number. Thus, by the binomial theorem, called
the universal theorem by Euler,

$$a^{j\omega} = (1 + k\omega)^j = 1 + \frac{j}{1}k\omega + \frac{j(j-1)}{1\cdot 2}k^2\omega^2 + \frac{j(j-1)(j-2)}{1\cdot 2\cdot 3}k^3\omega^3 + \cdots. \tag{15.3}$$


13 Spence (1819). 

14 Lagrange (1781).

15 Kummer (1840). 

16 Euler (1988) chapters 6 and 7. 
17 Cajori (1993) vol. 2, p. 107.
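He took $j$ to be infinitely large so that $j\omega = z$, a finite number, and equation (15.3)
was transformed to

$$a^z = \left(1 + \frac{kz}{j}\right)^j = 1 + \frac{1}{1}kz + \frac{1(j-1)}{1\cdot 2j}k^2z^2 + \frac{1(j-1)(j-2)}{1\cdot 2j\cdot 3j}k^3z^3 + \cdots. \tag{15.4}$$

For infinitely large $j$, he concluded that $\frac{j-1}{2j} = \frac{1}{2}$, $\frac{j-2}{3j} = \frac{1}{3}$, etc. and hence he
had the series

$$a^z = 1 + \frac{kz}{1} + \frac{k^2z^2}{1\cdot 2} + \frac{k^3z^3}{1\cdot 2\cdot 3} + \cdots. \tag{15.5}$$

He then set $z = 1$ to obtain the equation for $k$:

$$a = 1 + \frac{k}{1} + \frac{k^2}{1\cdot 2} + \frac{k^3}{1\cdot 2\cdot 3} + \cdots. \tag{15.6}$$

He denoted by $e$ the value of $a$ when $k = 1$ and computed it to 23 decimal places.
From (15.5) and (15.6), with $a = e$, Euler obtained these famous equations:

$$e^z = 1 + \frac{z}{1} + \frac{z^2}{1\cdot 2} + \frac{z^3}{1\cdot 2\cdot 3} + \cdots, \tag{15.7}$$

$$e = 1 + \frac{1}{1} + \frac{1}{1\cdot 2} + \frac{1}{1\cdot 2\cdot 3} + \cdots. \tag{15.8}$$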
ciemmy Daas ey Pe eae 


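It also followed from (15.6) and (15.7) that $k = \ln a$ where ln stands for the natural
logarithm. To find the series for $\log_a(1 + x)$, Euler set $a^{j\omega} = (1 + k\omega)^j = 1 + x$, so
that

$$j\omega = \log_a(1 + x) = \frac{j}{k}\left((1 + x)^{1/j} - 1\right) \tag{15.9}$$

and he then expanded the expression in parentheses by the binomial theorem to get

$$\log_a(1 + x) = \frac{j}{k}\left(\frac{1}{j}x - \frac{1(j-1)}{j\cdot 2j}x^2 + \frac{1(j-1)(2j-1)}{j\cdot 2j\cdot 3j}x^3 - \cdots\right) = \frac{1}{k}\left(x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots\right). \tag{15.10}$$

The second equation in (15.10) followed from the condition that $j$ was an infinitely
large number. When $k = 1$,

$$\ln(1 + x) = \log_e(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots. \tag{15.11}$$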
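
Euler's use of an infinitely large j can be imitated, in the spirit of the later treatment by limits, by taking j large but finite. The sketch below is only an illustration; the function names are chosen here, the base is taken to be e (so k = 1), and floating-point arithmetic limits how large j can usefully be made. It evaluates the closed expressions in (15.4) and (15.9) and compares them with the library values of e and ln 2.

```python
import math

def exp_via_power(z, j):
    """(1 + z/j)^j, the expression in (15.4) with k = 1 and j finite."""
    return (1.0 + z / j) ** j

def log_via_root(x, j):
    """j*((1 + x)^(1/j) - 1), the expression in (15.9) with k = 1 and j finite."""
    return j * ((1.0 + x) ** (1.0 / j) - 1.0)

for j in (10, 1_000, 100_000):
    print(j, exp_via_power(1.0, j), log_via_root(1.0, j))

# Reference values from the standard library for comparison.
print(math.e, math.log(2.0))
```
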


To obtain the series for $\sin x$ and $\cos x$, Euler started with de Moivre’s formulas

$$\cos nz = \frac{(\cos z + \sqrt{-1}\sin z)^n + (\cos z - \sqrt{-1}\sin z)^n}{2}, \tag{15.12}$$

$$\sin nz = \frac{(\cos z + \sqrt{-1}\sin z)^n - (\cos z - \sqrt{-1}\sin z)^n}{2\sqrt{-1}}. \tag{15.13}$$

By the binomial theorem, equation (15.12) could be written as

$$\cos nz = (\cos z)^n - \frac{n(n-1)}{1\cdot 2}(\cos z)^{n-2}(\sin z)^2 + \frac{n(n-1)(n-2)(n-3)}{1\cdot 2\cdot 3\cdot 4}(\cos z)^{n-4}(\sin z)^4 - \cdots.$$

Euler took $n$ infinitely large and $z$ infinitely small, such that $nz = x$ was finite.
He then concluded that $\sin z = z = \frac{x}{n}$ and $\cos z = 1$, and hence

$$\cos x = 1 - \frac{x^2}{1\cdot 2} + \frac{x^4}{1\cdot 2\cdot 3\cdot 4} - \frac{x^6}{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6} + \cdots. \tag{15.14}$$


Similarly, Euler found the series for sinx from (15.13). Since we have discussed at 
some length the series for sine and cosine in connection with the works of Madhava, 
Newton, and Leibniz, we will not reproduce Euler’s derivation of the series for sine, 
very similar to his derivation of (15.14). 


15.3. Euler: Products for Trigonometric Functions 


Euler derived the infinite products for the sine and cosine functions¹⁸ from the Cotes
factorization of $x^n \pm y^n$. Note that Cotes’s formula for $n$ odd, given in Section 12.4,
may be expressed as

$$x^n - y^n = (x - y)\prod_{k=1}^{(n-1)/2}\left(x^2 - 2xy\cos\frac{2k\pi}{n} + y^2\right). \tag{15.15}$$


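Euler observed that the series for the exponential, cosine and sine functions, with $j$
infinite, yielded the relations:

$$\cos x = \frac{e^{ix} + e^{-ix}}{2} = \frac{\left(1 + \frac{ix}{j}\right)^j + \left(1 - \frac{ix}{j}\right)^j}{2}, \tag{15.16}$$

$$\sin x = \frac{e^{ix} - e^{-ix}}{2i} = \frac{\left(1 + \frac{ix}{j}\right)^j - \left(1 - \frac{ix}{j}\right)^j}{2i}. \tag{15.17}$$

He first determined the factors of $e^x - 1 = \left(1 + \frac{x}{j}\right)^j - 1$ by Cotes’s formula. He
noted that one factor was $\left(1 + \frac{x}{j}\right) - 1 = \frac{x}{j}$ and the quadratic factors were of the form
$\left(1 + \frac{x}{j}\right)^2 - 2\left(1 + \frac{x}{j}\right)\cos\frac{2k\pi}{j} + 1$. He also noted that every factor could be obtained
by taking all positive even integers $2k$. Euler then set $\cos\frac{2k\pi}{j} = 1 - \frac{2k^2\pi^2}{j^2}$ by taking
the first two nonzero terms of the series expansion for cosine and then simplified the
quadratic factor to

$$\frac{x^2}{j^2} + \frac{4k^2\pi^2}{j^2} + \frac{4k^2\pi^2x}{j^3} = \frac{4k^2\pi^2}{j^2}\left(1 + \frac{x}{j} + \frac{x^2}{4k^2\pi^2}\right).$$

He observed at this point that though $\frac{x}{j}$ was infinitesimal, it could not be neglected
because there were $\frac{j}{2}$ factors, producing a nonzero term $\frac{x}{2}$. This remark shows us
that Euler had some pretty clear ideas about the convergence of infinite products.

18 ibid. chapter 9.

To eliminate this difficulty, Euler then considered the factors of
$e^x - e^{-x} = \left(1 + \frac{x}{j}\right)^j - \left(1 - \frac{x}{j}\right)^j$. In this case, he simplified the general quadratic factor to

$$\frac{4k^2\pi^2}{j^2}\left(1 + \frac{x^2}{k^2\pi^2} - \frac{x^2}{j^2}\right).$$

The contribution of the term $\frac{x^2}{j^2}$ after multiplication of $\frac{j}{2}$ factors was $\frac{x^2}{2j}$ and he
could now neglect this. So Euler determined the quadratic factors to be $1 + \frac{x^2}{k^2\pi^2}$ and
wrote the formula

$$\frac{e^x - e^{-x}}{2} = x\left(1 + \frac{x^2}{\pi^2}\right)\left(1 + \frac{x^2}{4\pi^2}\right)\left(1 + \frac{x^2}{9\pi^2}\right)\left(1 + \frac{x^2}{16\pi^2}\right)\cdots. \tag{15.18}$$

Similarly, he got

$$\frac{e^x + e^{-x}}{2} = \left(1 + \frac{4x^2}{\pi^2}\right)\left(1 + \frac{4x^2}{9\pi^2}\right)\left(1 + \frac{4x^2}{25\pi^2}\right)\left(1 + \frac{4x^2}{49\pi^2}\right)\cdots. \tag{15.19}$$

To obtain the products for $\sin x$ and $\cos x$, he changed $x$ to $ix$ to find

$$\sin x = x\left(1 - \frac{x^2}{\pi^2}\right)\left(1 - \frac{x^2}{4\pi^2}\right)\left(1 - \frac{x^2}{9\pi^2}\right)\left(1 - \frac{x^2}{16\pi^2}\right)\cdots, \tag{15.20}$$

$$\cos x = \left(1 - \frac{4x^2}{\pi^2}\right)\left(1 - \frac{4x^2}{9\pi^2}\right)\left(1 - \frac{4x^2}{25\pi^2}\right)\left(1 - \frac{4x^2}{49\pi^2}\right)\cdots. \tag{15.21}$$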
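
The infinite products for the sine and cosine converge slowly, which a short computation makes visible. The sketch below is an illustration only: the function names and the numbers of factors are chosen here, and the truncated products are compared with the library values rather than with anything computed by Euler.

```python
import math

def sin_product(x, factors):
    """x * prod_{k=1..factors} (1 - x^2 / (k^2 pi^2)), a truncation of the sine product."""
    value = x
    for k in range(1, factors + 1):
        value *= 1.0 - x * x / (k * k * math.pi ** 2)
    return value

def cos_product(x, factors):
    """prod_{k=1..factors} (1 - 4 x^2 / ((2k-1)^2 pi^2)), a truncation of the cosine product."""
    value = 1.0
    for k in range(1, factors + 1):
        value *= 1.0 - 4.0 * x * x / ((2 * k - 1) ** 2 * math.pi ** 2)
    return value

x = 1.0
for factors in (10, 1_000, 100_000):
    print(factors, sin_product(x, factors) - math.sin(x),
          cos_product(x, factors) - math.cos(x))
```

The printed differences shrink roughly in proportion to the reciprocal of the number of factors, so the products are of more theoretical than computational value.
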


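From (15.20) it appears that the constant factors somehow cancelled. To see this
more clearly, take $n = 2m + 1$ and replace $x$, $y$ in (15.15) by
$X = 1 + \frac{x\sqrt{-1}}{n}$, $Y = 1 - \frac{x\sqrt{-1}}{n}$ to arrive at

$$\frac{X^{2m+1} - Y^{2m+1}}{2\sqrt{-1}} = \frac{x}{2m+1}\prod_{k=1}^{m} 2\left(1 - \cos\frac{2k\pi}{2m+1} - \left(1 + \cos\frac{2k\pi}{2m+1}\right)\frac{x^2}{(2m+1)^2}\right)$$

$$= \frac{x}{2m+1}\,2^m\prod_{k=1}^{m}\left(1 - \cos\frac{2k\pi}{2m+1}\right)\left(1 - \frac{1 + \cos\frac{2k\pi}{2m+1}}{1 - \cos\frac{2k\pi}{2m+1}}\cdot\frac{x^2}{(2m+1)^2}\right).$$

To simplify, observe that

$$\lim_{x\to 1}\frac{x^{2m+1} - 1}{x - 1} = \lim_{x\to 1}\prod_{k=1}^{m}\left(x^2 + 1 - 2x\cos\frac{2k\pi}{2m+1}\right)$$

or

$$2m + 1 = 2^m\prod_{k=1}^{m}\left(1 - \cos\frac{2k\pi}{2m+1}\right).$$

Thus

$$\frac{X^{2m+1} - Y^{2m+1}}{2\sqrt{-1}} = x\prod_{k=1}^{m}\left(1 - \frac{1 + \cos\frac{2k\pi}{2m+1}}{1 - \cos\frac{2k\pi}{2m+1}}\cdot\frac{x^2}{(2m+1)^2}\right).$$

Now for each $k$ and with $m$ infinitely large, we see that

$$\frac{1 + \cos\frac{2k\pi}{2m+1}}{1 - \cos\frac{2k\pi}{2m+1}}\cdot\frac{x^2}{(2m+1)^2} = \frac{2}{\frac{2k^2\pi^2}{(2m+1)^2}}\cdot\frac{x^2}{(2m+1)^2} = \frac{x^2}{k^2\pi^2}$$

and therefore

$$\sin x = x\prod_{k=1}^{\infty}\left(1 - \frac{x^2}{k^2\pi^2}\right).$$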

15.4 Euler’s Finite Product for sinnx 


In his 1748 Introductio in Analysin Infinitorum, Euler also gave an elegant proof of

$$\sin(2m+1)x = \pm 2^{2m}\sin x\,\sin\left(x + \frac{2\pi}{2m+1}\right)\sin\left(x + \frac{4\pi}{2m+1}\right)\cdots\sin\left(x + \frac{4m\pi}{2m+1}\right), \tag{15.22}$$

with $n = 2m + 1$, an odd integer, and where the positive sign would be employed
when $m$ was even and the negative sign when $m$ was odd. Though there are various
methods of proof, Euler’s was not only very nice, but also could be applied to derive
other identities. Euler’s proof relied upon a result well-known at that time: Suppose
$a_1, a_2, \ldots, a_{2m+1}$ are the roots of

$$1 - c_1y + c_2y^2 - \cdots - c_{2m+1}y^{2m+1} = 0.$$

Then

$$1 - c_1y + c_2y^2 - \cdots - c_{2m+1}y^{2m+1} = \left(1 - \frac{y}{a_1}\right)\left(1 - \frac{y}{a_2}\right)\cdots\left(1 - \frac{y}{a_{2m+1}}\right). \tag{15.23}$$

Moreover, multiplying out the factors and then equating the coefficients, one
obtains Girard’s relations:

$$\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_{2m+1}} = c_1, \tag{15.24}$$

$$\frac{1}{a_1}\cdot\frac{1}{a_2}\cdots\frac{1}{a_{2m+1}} = c_{2m+1}; \tag{15.25}$$

in this connection, see Section 13.3.
Euler began his proof¹⁹ by noting the result of Viète, later stated by Newton, in
more general terms: With n = 2m + 1,

$$\sin nx = n\sin x-\frac{n(n^2-1^2)}{1\cdot2\cdot3}\sin^3x+\frac{n(n^2-1^2)(n^2-3^2)}{1\cdot2\cdot3\cdot4\cdot5}\sin^5x-\cdots+(-1)^m\frac{n(n^2-1^2)\cdots(n^2-(2m-1)^2)}{1\cdot2\cdot3\cdots(2m+1)}\sin^{2m+1}x. \tag{15.26}$$

For a discussion of (15.26), see Sections 8.5-8.7. Euler observed that when x is
assigned the values $x,\ x+\frac{2\pi}{n},\ x+\frac{4\pi}{n},\ \ldots,\ x+\frac{2(n-1)\pi}{n}$, sin nx keeps the same value. Observe that
if n and x are fixed, then sin nx, the left-hand side of (15.26), is a constant and (15.26)
is a polynomial of degree n = 2m + 1 in y = sin x, with roots

$$a_1 = \sin x,\quad a_2 = \sin\left(x+\frac{2\pi}{n}\right),\quad\ldots,\quad a_n = \sin\left(x+\frac{2(n-1)\pi}{n}\right).$$

Thus, $a_1, a_2, \ldots, a_n$ are the roots of the equation obtained after dividing (15.26) by
sin nx. Euler wrote this equation as

$$1-\frac{n}{\sin nx}\,y+\frac{n(n^2-1)}{1\cdot2\cdot3\cdot\sin nx}\,y^3-\cdots\mp\frac{2^{n-1}}{\sin nx}\,y^n = 0, \tag{15.27}$$


since the coefficient of $y^n$ in (15.26) has the absolute value

$$\frac{n(n^2-1^2)\cdots(n^2-(2m-1)^2)}{n!} = \frac{2m(2m+2)(2m-2)(2m+4)\cdots2(4m)}{(2m)!} = \frac{2^{2m}(2m)!}{(2m)!} = 2^{n-1}.$$


19 Euler (1988) pp. 205-206.
So with $a_1 = \sin x, \ldots, a_n = \sin\left(x+\frac{2(n-1)\pi}{n}\right)$, equations (15.25) and (15.27) imply
(15.22). On the other hand, as Euler noted, (15.24) and (15.27) produce

$$\frac{n}{\sin nx} = \frac{1}{\sin x}+\frac{1}{\sin\left(x+\frac{2\pi}{n}\right)}+\frac{1}{\sin\left(x+\frac{4\pi}{n}\right)}+\cdots+\frac{1}{\sin\left(x+\frac{2(n-1)\pi}{n}\right)}. \tag{15.28}$$

Now observe that, except for the constant 1, only odd powers of y appear in
equation (15.27); thus, the coefficient of $y^{n-1} = y^{2m}$ is zero. This gave Euler the
relation

$$0 = \sin x+\sin\left(x+\frac{2\pi}{n}\right)+\sin\left(x+\frac{4\pi}{n}\right)+\cdots+\sin\left(x+\frac{2(n-1)\pi}{n}\right). \tag{15.29}$$


Euler made use of all these formulas to establish many more trigonometric 
identities, as given in chapter 14 of his Introductio, and Cauchy employed (15.22) and 
(15.28) to determine the infinite product for the sine function by a method different 
from Euler’s, without using Cotes’s factorization. 
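The three identities of this section can be sanity-checked numerically. The sketch below assumes the roots are $a_k = \sin\left(x+\frac{2k\pi}{n}\right)$ for $k = 0, \ldots, n-1$ with $n = 2m+1$, and compares $\sin nx$ with $\pm2^{2m}a_0a_1\cdots a_{n-1}$, compares $n/\sin nx$ with the sum of the reciprocals of the roots, and evaluates the sum of the roots, which should vanish. The helper name `euler_checks` is hypothetical.

```python
import math


def euler_checks(m, x):
    """Return both sides of the product and reciprocal-sum identities, plus the root sum."""
    n = 2 * m + 1
    roots = [math.sin(x + 2 * k * math.pi / n) for k in range(n)]
    sign = 1.0 if m % 2 == 0 else -1.0           # plus sign for even m, minus for odd m
    product_side = sign * 2.0 ** (2 * m) * math.prod(roots)
    reciprocal_sum = sum(1.0 / a for a in roots)
    root_sum = sum(roots)
    return math.sin(n * x), product_side, n / math.sin(n * x), reciprocal_sum, root_sum


if __name__ == "__main__":
    for m in (1, 2, 3, 4):
        print(m, euler_checks(m, 0.3))
```

This is only a floating-point check at sample points; it does not replace Euler's argument via the coefficients of the polynomial in $y = \sin x$.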


15.5 Cauchy’s Derivation of the Product Formulas 


In his lectures at the École Polytechnique, published in 1821 under the title Analyse
algébrique,²⁰ Cauchy gave a treatment of infinite series and products more rigorous
than Euler’s. Cauchy then applied these ideas to a discussion of elementary functions;
his discourse on infinite products was presented in note IX as the last topic in the
work. Suppose $u_0, u_1, u_2, \ldots$ to be real numbers with $u_n > -1$. Cauchy started his
discussion of infinite products with the observation that if the series

$$\ln(1+u_0)+\ln(1+u_1)+\ln(1+u_2)+\cdots \tag{15.30}$$

converged to s, then the sequence

$$P_n = (1+u_0)(1+u_1)\cdots(1+u_{n-1}),\quad n = 1, 2, 3, \ldots,$$

converged to a finite limit different from zero, equal to $e^s$. Thus, according to Cauchy,
the infinite product

$$(1+u_0)(1+u_1)(1+u_2)\cdots \tag{15.31}$$

was said to converge if $\lim_{n\to\infty}\left((1+u_0)\cdots(1+u_n)\right)$ existed and was different from
zero. He then stated the theorem: Suppose the series (15.30) converges. If the series


20 Cauchy (1989).

$$u_0+u_1+u_2+\cdots \tag{15.32}$$

and

$$u_0^2+u_1^2+u_2^2+\cdots \tag{15.33}$$

are convergent, then the infinite product (15.31) converges. However, if (15.32) is
convergent and (15.33) is divergent, then the infinite product diverges to zero.

In his proof, Cauchy first noted that the convergence of (15.30) implied that
$\ln(1+u_n)\to0$ as $n\to\infty$, and this in turn implied that $u_n\to0$ as $n\to\infty$.
He next observed that for large enough n,

$$\ln(1+u_n) = u_n-\frac{1}{2}u_n^2+\frac{1}{3}u_n^3-\cdots = u_n-\frac{1}{2}u_n^2(1\pm\varepsilon_n),$$

with $\varepsilon_n$ infinitesimally small. He concluded that

$$\ln(1+u_n)+\ln(1+u_{n+1})+\cdots+\ln(1+u_{n+m-1}) = (u_n+u_{n+1}+\cdots+u_{n+m-1})-\frac{1}{2}\left(u_n^2+u_{n+1}^2+\cdots+u_{n+m-1}^2\right)(1\pm\varepsilon), \tag{15.34}$$

when all the $u_k$ had absolute value less than one and $1\pm\varepsilon$ was the average of $1\pm\varepsilon_n$,
$1\pm\varepsilon_{n+1}, \ldots$. Formula (15.34) completed the proof of the theorem, because the infinite
product converged if and only if the series $\ln(1+u_0)+\ln(1+u_1)+\ln(1+u_2)+\cdots$
converged. As examples of this theorem, Cauchy noted that the product

$$\left(1+x^2\right)\left(1+\frac{x^2}{2^2}\right)\left(1+\frac{x^2}{3^2}\right)\cdots$$

converged for all x, while the product

$$(1+1)\left(1-\frac{1}{\sqrt{2}}\right)\left(1+\frac{1}{\sqrt{3}}\right)\left(1-\frac{1}{\sqrt{4}}\right)\cdots$$

diverged to zero.
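The contrast between the two cases of the theorem can be seen in partial products. The sketch below assumes the two examples are $\prod_k\left(1+\frac{x^2}{k^2}\right)$, whose value is $\frac{\sinh\pi x}{\pi x}$, and $\prod_k\left(1+\frac{(-1)^{k+1}}{\sqrt{k}}\right)$, for which $\sum u_k$ converges while $\sum u_k^2$ diverges. The function names are illustrative.

```python
import math


def partial_product(factor, count):
    """Return factor(1) * factor(2) * ... * factor(count)."""
    value = 1.0
    for k in range(1, count + 1):
        value *= factor(k)
    return value


def convergent_factor(k, x=1.0):
    """1 + u_k with u_k = x^2 / k^2; both sum(u_k) and sum(u_k^2) converge."""
    return 1.0 + (x * x) / (k * k)


def divergent_factor(k):
    """1 + u_k with u_k = +1/sqrt(1), -1/sqrt(2), +1/sqrt(3), ...; sum(u_k^2) diverges."""
    return 1.0 + (-1.0) ** (k + 1) / math.sqrt(k)


if __name__ == "__main__":
    for count in (100, 10000, 1000000):
        print(count,
              partial_product(convergent_factor, count),
              partial_product(divergent_factor, count))
```

The first column of products settles near $\sinh(\pi)/\pi$, while the second keeps shrinking, roughly like the reciprocal of the square root of the number of factors.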


Although earlier mathematicians did not explicitly state such a theorem on infinite 
products, it can hardly be doubted that Euler, with his enormous experience manipulat- 
ing and calculating with infinite products and series, intuitively understood this result. 
However, Cauchy’s presentation — with clearer, more precise and explicit definitions 
of fundamental concepts such as limits, continuity, and convergence — paved the way 
for future generations of mathematicians. His work led to higher standards of clarity 
and rigor in definitions, statements of theorems, and proofs. 

While Cauchy was a pioneer in the establishment of rigor in mathematical 
arguments, we comment that his lectures nevertheless contain many arguments in the 
looser style of that time. This was perhaps a concession to his students, who apparently 
chafed at the rigorous approach. Thus, Cauchy’s proof of the infinite product formula
for sin x, located between equations (11) and (16) inclusive of note 9 in his book,
contained less-than-rigorous statements about the averages of various quantities. In
his main argument, that includes equations (12) and (13), for example, he dealt with
2n quantities, $A_1, B_1, A_2, B_2, \ldots, A_n, B_n$. He wrote that if $1+\alpha$ denoted the
average of the quantities

$$\frac{A_1}{B_1},\ \frac{A_2}{B_2},\ \ldots,\ \frac{A_n}{B_n},$$

then

$$A_1+A_2+\cdots+A_n = (B_1+B_2+\cdots+B_n)(1+\alpha).$$

However, because the two $\alpha$’s are not necessarily identical, it would require a good
amount of work to verify this. Cauchy makes a similar jump in reasoning, that would
have required a very complicated argument to justify, in a result involving a quantity
he labels $1+\beta$. Nevertheless, Cauchy actually had all the results necessary for a
complete proof of the infinite product formula for sin x and our presentation here is
based exclusively upon Cauchy’s own results and methods, as presented in his Analyse
Algébrique, especially note 8.

For our proof, first note that

$$\sin\left(x+\frac{2\pi}{n}\right)\sin\left(x+\frac{2(n-1)\pi}{n}\right) = \sin^2x-\sin^2\frac{2\pi}{n},$$

$$\sin\left(x+\frac{4\pi}{n}\right)\sin\left(x+\frac{2(n-2)\pi}{n}\right) = \sin^2x-\sin^2\frac{4\pi}{n},\ \text{etc.}$$

Hence, by Euler’s formula (15.22) for odd n,

$$\sin nx = 2^{n-1}\sin x\left(\sin^2\frac{2\pi}{n}-\sin^2x\right)\left(\sin^2\frac{4\pi}{n}-\sin^2x\right)\cdots\left(\sin^2\frac{(n-1)\pi}{n}-\sin^2x\right). \tag{15.35}$$

It is easy to see that the set of $\frac{n-1}{2}$ numbers $\sin^2\frac{2\pi}{n}, \sin^2\frac{4\pi}{n}, \ldots, \sin^2\frac{(n-1)\pi}{n}$ is
identical to the set of $\frac{n-1}{2}$ numbers $\sin^2\frac{\pi}{n}, \sin^2\frac{2\pi}{n}, \ldots, \sin^2\frac{(n-1)\pi}{2n}$. Thus,

$$\sin nx = 2^{n-1}\sin x\left(\sin^2\frac{\pi}{n}-\sin^2x\right)\left(\sin^2\frac{2\pi}{n}-\sin^2x\right)\cdots\left(\sin^2\frac{(n-1)\pi}{2n}-\sin^2x\right),$$

in which let $x\to0$ to get

$$n = 2^{n-1}\sin^2\frac{\pi}{n}\,\sin^2\frac{2\pi}{n}\cdots\sin^2\frac{(n-1)\pi}{2n}.$$

After replacing nx by x,

$$\sin x = n\sin\frac{x}{n}\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{\pi}{n}}\right)\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{2\pi}{n}}\right)\cdots\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{(n-1)\pi}{2n}}\right),$$

given as equation (18) in note 8 of Cauchy’s Analyse Algébrique.
When $n\to\infty$, we get

$$\frac{\sin^2\frac{x}{n}}{\sin^2\frac{k\pi}{n}}\to\frac{x^2}{k^2\pi^2}.$$

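Take x > 0 and let m be any fixed number less than $\frac{n-1}{2}$, chosen so that
$(m+1)\pi^2 > 4x^2$. Then write

$$\sin x = n\sin\frac{x}{n}\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{\pi}{n}}\right)\cdots\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{m\pi}{n}}\right)\times\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{(m+1)\pi}{n}}\right)\cdots\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{(n-1)\pi}{2n}}\right). \tag{15.36}$$

Now because $\prod(1-a_i) > 1-\sum a_i$, we have

$$\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{(m+1)\pi}{n}}\right)\cdots\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{(n-1)\pi}{2n}}\right) > 1-\sin^2\frac{x}{n}\left(\csc^2\frac{(m+1)\pi}{n}+\csc^2\frac{(m+2)\pi}{n}+\cdots+\csc^2\frac{(n-1)\pi}{2n}\right). \tag{15.37}$$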


Our calculations are now similar to Cauchy’s derivation of equation (30) from (29)
in his note 8. Since for $0 < x < \frac{\pi}{2}$ we have $x < 2\sin x$, it follows that

$$\sin^2\frac{x}{n}\,\csc^2\frac{k\pi}{n} < \frac{4x^2}{k^2\pi^2}. \tag{15.38}$$


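Thus, the product in (15.37) is greater than

$$1-\frac{4x^2}{\pi^2}\left(\frac{1}{(m+1)^2}+\frac{1}{(m+2)^2}+\cdots+\frac{4}{(n-1)^2}\right) > 1-\frac{4x^2}{\pi^2(m+1)} > 0,$$

and less than one. Therefore, it is equal to $1-\frac{4x^2\theta_n}{\pi^2(m+1)}$, where $\theta_n$ lies between 0
and 1. Thus, (15.36) can be written as

$$\sin x = n\sin\frac{x}{n}\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{\pi}{n}}\right)\cdots\left(1-\frac{\sin^2\frac{x}{n}}{\sin^2\frac{m\pi}{n}}\right)\left(1-\frac{4x^2\theta_n}{\pi^2(m+1)}\right). \tag{15.39}$$

Also observe that

$$n\sin\frac{x}{n}\to x$$

and

$$\theta_n\to\theta\ \text{(say) as}\ n\to\infty.$$

Finally, letting $n\to\infty$ in (15.39), we arrive at

$$\sin x = x\left(1-\frac{x^2}{\pi^2}\right)\cdots\left(1-\frac{x^2}{m^2\pi^2}\right)\left(1-\frac{4x^2\theta}{\pi^2(m+1)}\right).$$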


The result follows by letting m — oo. The restriction x > 0 can now be removed. 
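The finite product on which this proof rests holds exactly for every odd n, which is easy to confirm numerically. The sketch below assumes the identity $\sin x = n\sin\frac{x}{n}\prod_{k=1}^{(n-1)/2}\left(1-\frac{\sin^2(x/n)}{\sin^2(k\pi/n)}\right)$; the function name `cauchy_finite_product` is an illustrative choice.

```python
import math


def cauchy_finite_product(x, n):
    """Evaluate n*sin(x/n) * prod_{k=1}^{(n-1)/2} (1 - sin^2(x/n)/sin^2(k*pi/n)) for odd n."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    s2 = math.sin(x / n) ** 2
    value = n * math.sin(x / n)
    for k in range(1, (n - 1) // 2 + 1):
        value *= 1.0 - s2 / math.sin(k * math.pi / n) ** 2
    return value


if __name__ == "__main__":
    x = 0.7
    for n in (3, 11, 101, 1001):
        # Up to rounding, every odd n reproduces sin(x).
        print(n, cauchy_finite_product(x, n) - math.sin(x))
```

For n = 3 the product reduces to the familiar $3s-4s^3$ with $s = \sin\frac{x}{3}$.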


15.6 Euler and Niklaus I Bernoulli: Partial Fraction Expansions 


Recall that Euler obtained the partial fractions expansions of csc x and cot x by the
use of integration methods and, later on, by other methods. The former approach
depended upon the evaluation of the integrals $\int_0^\infty\frac{x^{p-1}}{1\pm x^q}\,dx$ in two different ways. First,
Euler used Cotes’s factorization of $1\pm x^q$ to express $\frac{x^{p-1}}{1\pm x^q}$ as partial fractions; he then
integrated to find

$$\int_0^\infty\frac{x^{p-1}}{1+x^q}\,dx = \frac{\pi}{q\sin\frac{p\pi}{q}} \tag{15.40}$$

and

$$\int_0^\infty\frac{x^{p-1}}{1-x^q}\,dx = \frac{\pi}{q\tan\frac{p\pi}{q}}, \tag{15.41}$$


where 0 < p < q and p, q were integers. For Euler’s derivation of (15.40) and 
(15.41), see Section 12.6. The integral in (15.41) is a principal value, although this 
concept was not explicitly defined until Cauchy introduced it in the 1820s. Dedekind 
gave a number of proofs of (15.40), including a streamlined form of Euler’s proof, and 
we present that in Section 17.8. 

For Euler’s derivation of the partial fractions expansion of csc x, note that the
change of variable $y = \frac{1}{x}$ shows that

$$\int_1^\infty\frac{x^{p-1}}{1+x^q}\,dx = \int_0^1\frac{y^{q-p-1}}{1+y^q}\,dy.$$

So Euler could rewrite the integral (15.40) as

$$\int_0^\infty\frac{x^{p-1}}{1+x^q}\,dx = \int_0^1\frac{x^{p-1}+x^{q-p-1}}{1+x^q}\,dx$$

$$= \int_0^1\left(x^{p-1}+x^{q-p-1}-x^{q+p-1}-x^{2q-p-1}+x^{2q+p-1}+\cdots\right)dx$$

$$= \frac{1}{p}+\frac{1}{q-p}-\frac{1}{q+p}-\frac{1}{2q-p}+\frac{1}{2q+p}+\frac{1}{3q-p}-\cdots = \frac{\pi}{q\sin\frac{p\pi}{q}}.$$

Thus, he had the partial fractions expansion for csc x

$$\frac{\pi}{\sin\pi x} = \frac{1}{x}+\frac{1}{1-x}-\frac{1}{1+x}-\frac{1}{2-x}+\frac{1}{2+x}+\frac{1}{3-x}-\cdots$$

$$= \frac{1}{x}-\frac{2x}{x^2-1^2}+\frac{2x}{x^2-2^2}-\frac{2x}{x^2-3^2}+\text{etc.} \tag{15.42}$$


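In a similar way, he obtained the partial fractions expansion of cot x:

$$\frac{\pi}{\tan\pi x} = \frac{1}{x}-\frac{1}{1-x}+\frac{1}{1+x}-\frac{1}{2-x}+\frac{1}{2+x}-\frac{1}{3-x}+\frac{1}{3+x}-\cdots$$

$$= \frac{1}{x}+\frac{2x}{x^2-1^2}+\frac{2x}{x^2-2^2}+\frac{2x}{x^2-3^2}+\cdots. \tag{15.43}$$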


Note that by integrating (15.43), Euler had another way to obtain the product for 
sin x. 
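Both expansions lend themselves to a direct numerical check. The sketch below assumes the paired forms $\frac{\pi}{\sin\pi x} = \frac{1}{x}+\sum_{k\ge1}(-1)^k\frac{2x}{x^2-k^2}$ and $\frac{\pi}{\tan\pi x} = \frac{1}{x}+\sum_{k\ge1}\frac{2x}{x^2-k^2}$, summed symmetrically in k; the function names are illustrative.

```python
import math


def csc_expansion(x, terms):
    """Partial sum of 1/x - 2x/(x^2-1^2) + 2x/(x^2-2^2) - ... (alternating signs)."""
    total = 1.0 / x
    for k in range(1, terms + 1):
        total += (-1) ** k * 2.0 * x / (x * x - k * k)
    return total


def cot_expansion(x, terms):
    """Partial sum of 1/x + 2x/(x^2-1^2) + 2x/(x^2-2^2) + ..."""
    total = 1.0 / x
    for k in range(1, terms + 1):
        total += 2.0 * x / (x * x - k * k)
    return total


if __name__ == "__main__":
    x = 0.3
    print(csc_expansion(x, 2000), math.pi / math.sin(math.pi * x))
    print(cot_expansion(x, 100000), math.pi / math.tan(math.pi * x))
```

The cotangent series converges more slowly because its terms all have the same sign, so more terms are used there.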

In a letter of January 16, 1742,²¹ Euler communicated these results to Niklaus I
Bernoulli, also observing that partial fractions expansions of other functions could be
found by repeated differentiation of (15.42) and (15.43). In particular, he had

$$\frac{\pi^2\cos\pi x}{(\sin\pi x)^2} = \frac{1}{x^2}-\frac{1}{(1-x)^2}-\frac{1}{(1+x)^2}+\frac{1}{(2-x)^2}+\frac{1}{(2+x)^2}-\frac{1}{(3-x)^2}-\text{etc.} \tag{15.44}$$

and

$$\frac{\pi^2}{(\sin\pi x)^2} = \frac{1}{x^2}+\frac{1}{(1-x)^2}+\frac{1}{(1+x)^2}+\frac{1}{(2-x)^2}+\frac{1}{(2+x)^2}+\text{etc.} \tag{15.45}$$


In his reply of July 13, 1742,²² Bernoulli noted that the logarithmic differentiation
of Euler’s product for sin x would immediately produce (15.43). To see this, we
note that


21 Eu. IV A-2 pp. 483-490.
22 Eu. IV A-2 pp. 491-500.

$$\ln\sin\pi x = \ln(\pi x)+\sum_{n=1}^{\infty}\ln\left(1-\frac{x^2}{n^2}\right);$$

differentiation yields the required formula. In his subsequent letter of October 24,
1742,²³ Bernoulli noted that (15.42) could also be obtained by logarithmic differentiation.
In his letter, Bernoulli wrote the differential d ln x as differ. ln x. His notation for
the natural logarithm was log. We maintain our practice of writing ln x, and present
Bernoulli’s argument as he wrote it:


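$$\text{differ.}\ln\frac{\sin\pi x}{\cos\pi x} = \frac{\text{differ.}\sin\pi x}{\sin\pi x}-\frac{\text{differ.}\cos\pi x}{\cos\pi x} = \frac{\pi\,dx\cos\pi x}{\sin\pi x}+\frac{\pi\,dx\sin\pi x}{\cos\pi x} = \frac{\pi\,dx}{\sin\pi x\cos\pi x}$$

$$= \frac{2\pi\,dx}{\sin2\pi x} = \text{differ.}\ln\frac{\pi x\left(1-xx\right)\left(1-\frac{1}{4}xx\right)\left(1-\frac{1}{9}xx\right)\text{etc.}}{\left(1-4xx\right)\left(1-\frac{4}{9}xx\right)\left(1-\frac{4}{25}xx\right)\text{etc.}}$$

Replace 2x by x to get

$$\frac{\pi\,dx}{\sin\pi x} = \text{differ.}\ln\frac{\frac{1}{2}\pi x\left(1-\frac{1}{4}xx\right)\left(1-\frac{1}{16}xx\right)\left(1-\frac{1}{36}xx\right)\text{etc.}}{\left(1-xx\right)\left(1-\frac{1}{9}xx\right)\left(1-\frac{1}{25}xx\right)\text{etc.}}$$

$$= dx\left(\frac{1}{x}+\frac{1}{1-x}-\frac{1}{1+x}-\frac{1}{2-x}+\frac{1}{2+x}+\text{etc.}\right),$$

which yields (15.42) after division by dx.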

In his Introductio,²⁴ Euler gave an alternate derivation of the partial fractions
expansions of the trigonometric functions, avoiding the use of integration and
differentiation. He first showed by Cotes’s formula that the quadratic factors of

$$e^y+e^ce^{-y} = \left(1+\frac{y}{j}\right)^j+e^c\left(1-\frac{y}{j}\right)^j$$

were of the form

$$1-\frac{4cy-4y^2}{m^2\pi^2+c^2}$$

with odd m, and hence

$$\frac{e^y+e^ce^{-y}}{1+e^c} = \left(1-\frac{4cy-4y^2}{\pi^2+c^2}\right)\left(1-\frac{4cy-4y^2}{9\pi^2+c^2}\right)\left(1-\frac{4cy-4y^2}{25\pi^2+c^2}\right)\text{etc.},$$


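23 Eu. IV A-2 pp. 532-550.
24 Euler (1988) pp. 130, 134, 135.
where the denominator on the left-hand side was chosen so that the value of both sides
of the equation was 1 for y = 0. Euler then took $c = i\pi x$ and $y = \frac{i\pi v}{2}$, so that the
left-hand side reduced to $\cos\frac{\pi v}{2}+\tan\frac{\pi x}{2}\sin\frac{\pi v}{2}$. This gave him the formula

$$\cos\frac{\pi v}{2}+\tan\frac{\pi x}{2}\sin\frac{\pi v}{2} = \left(1+\frac{v}{1-x}\right)\left(1-\frac{v}{1+x}\right)\left(1+\frac{v}{3-x}\right)\left(1-\frac{v}{3+x}\right)\text{etc.}$$

In a similar way he derived

$$\cos\frac{\pi v}{2}+\cot\frac{\pi x}{2}\sin\frac{\pi v}{2} = \left(1+\frac{v}{x}\right)\left(1-\frac{v}{2-x}\right)\left(1+\frac{v}{2+x}\right)\left(1-\frac{v}{4-x}\right)\left(1+\frac{v}{4+x}\right)\cdots.$$

By equating the coefficients of v in the two equations, he had respectively

$$\frac{\pi}{2}\tan\frac{\pi x}{2} = \frac{1}{1-x}-\frac{1}{1+x}+\frac{1}{3-x}-\frac{1}{3+x}+\frac{1}{5-x}-\frac{1}{5+x}+\text{etc.}$$

and

$$\frac{\pi}{2}\cot\frac{\pi x}{2} = \frac{1}{x}-\frac{1}{2-x}+\frac{1}{2+x}-\frac{1}{4-x}+\frac{1}{4+x}-\text{etc.}$$

Since

$$\frac{\pi}{2}\tan\frac{\pi x}{2}+\frac{\pi}{2}\cot\frac{\pi x}{2} = \frac{\pi}{\sin\pi x},$$

Euler was also able to obtain the partial fractions expansion of $\frac{\pi}{\sin\pi x}$. Note that
in section 225 of the second part of his 1755 differential calculus book, Euler replaced
x by $\frac{1}{2}-x$ in (15.42), and if we do that we get

$$\pi\sec\pi x = \frac{4}{1-4x^2}-\frac{4\cdot3}{3^2-4x^2}+\frac{4\cdot5}{5^2-4x^2}-\cdots.$$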


15.7 Euler: Logarithm 


Euler defined the logarithm of a number as a multivalued function in his 1751 paper, 
“De la controverse entre Mrs. Leibniz et Bernoulli sur les logarithmes des nombres 
negatifs et imaginaires.”²⁵ Euler started his paper by discussing the positions of Johann
Bernoulli and Leibniz on the logarithms of negative numbers; he proceeded to show
that each point of view led to a contradiction. He then stated the theorem that resolved
the difficulties: If y denotes the logarithm of any number x, then y must contain an
infinity of different values.


25 Eu. I-17 pp. 195-232. E 168.
To prove this theorem, Euler first observed that if $\omega$ were an infinitely small number,
then $\log(1+\omega) = \omega$, so that

$$\log(1+\omega)^n = n\omega.$$

He reasoned that n must be infinitely large because $(1+\omega)^n = x$, where x denoted
any given number. With n infinitely large,

$$x = (1+\omega)^n\quad\text{and}\quad\log x = n\omega = y.$$

To express y in terms of x, he observed that $1+\omega = x^{1/n}$. Hence, $\omega = x^{1/n}-1$ and
$y = n\omega = nx^{1/n}-n = \log x$. Thus, as the value of n in $nx^{1/n}-n$ became larger and
larger, $nx^{1/n}-n$ would come closer and closer to its true value, log x. Euler argued that
$x^{1/2}$ had two values, $x^{1/3}$ had three values, and so on; therefore, $x^{1/n}$ would take an infinity
of different values with n infinitely large. Thus, the theorem was that log x must have
infinitely many values.
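Euler's heuristic can be imitated numerically: for a large but finite n, each of the n complex n-th roots of x gives a value of $n\left(x^{1/n}-1\right)$, and the roots nearest to 1 give values close to $\log|x|+i(\arg x+2\lambda\pi)$. The following sketch uses hypothetical names (`euler_log_approximations`, `lambdas`) and a finite n in place of Euler's infinitely large number.

```python
import cmath
import math


def euler_log_approximations(x, n, lambdas):
    """For each integer lam, form the n-th root of x with angle (arg x + 2*pi*lam)/n
    and return n * (root - 1), Euler's finite-n approximation to a value of log x."""
    radius = abs(x) ** (1.0 / n)
    angle = cmath.phase(x)
    values = []
    for lam in lambdas:
        root = radius * cmath.exp(1j * (angle + 2.0 * math.pi * lam) / n)
        values.append(n * (root - 1.0))
    return values


if __name__ == "__main__":
    for value in euler_log_approximations(2.0, 10 ** 8, [0, 1, -1, 2]):
        print(value)   # close to log 2 + 2*pi*lam*i for lam = 0, 1, -1, 2
    print(euler_log_approximations(-1.0, 10 ** 8, [0, -1]))   # close to +pi*i and -pi*i
```

The approximation error is of order $|y|^2/(2n)$, so it is only meaningful for values of $\lambda$ that are small compared with n.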

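After proving this theorem, the first problem posed by Euler was to find all the
values of log a, with a a positive number. He noted that since a was positive, one of
the values of log a had to be a real number A. Since $\log a = \log(1\cdot a) = \log1+A$,
he needed to find all the values of $\log1 = n1^{1/n}-n = y$. He thus had to determine all
values of y that were solutions of the equation

$$\left(1+\frac{y}{n}\right)^n = 1\quad\text{or}\quad\left(1+\frac{y}{n}\right)^n-1 = 0. \tag{15.47}$$

To factorize $\left(1+\frac{y}{n}\right)^n-1$, he noted that a typical factor of $p^n-q^n$ would be
$p^2-2pq\cos\frac{2\lambda\pi}{n}+q^2$, where $\lambda = 0, \pm1, \pm2, \ldots$. He observed that the solutions of
equation (15.47) with $p = 1+\frac{y}{n}$ and $q = 1$ were the solutions of

$$\left(1+\frac{y}{n}\right)^2-2\left(1+\frac{y}{n}\right)\cos\frac{2\lambda\pi}{n}+1 = 0,\quad\lambda = 0, \pm1, \pm2, \ldots, \tag{15.48}$$

and these were:

$$1+\frac{y}{n} = \cos\frac{2\lambda\pi}{n}\pm\sqrt{-1}\,\sin\frac{2\lambda\pi}{n}.$$

Because $\frac{2\lambda\pi}{n}$ was an infinitesimal, $\cos\frac{2\lambda\pi}{n} = 1$ and $\sin\frac{2\lambda\pi}{n} = \frac{2\lambda\pi}{n}$; thus,

$$1+\frac{y}{n} = 1\pm\frac{2\lambda\pi}{n}\sqrt{-1}\quad\text{or}\quad y = \pm2\lambda\pi\sqrt{-1}.$$

Euler had thereby found all the values of log a as: $A\pm2\lambda\pi\sqrt{-1}$, $\lambda = 0, 1, 2, \ldots$.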
The second problem posed by Euler was: Find all values of log(−a), where a > 0.
This amounted to finding all values of log(−1). In that case, all values y of log(−1)
would be solutions of

$$\left(1+\frac{y}{n}\right)^n+1 = 0. \tag{15.49}$$

Euler noted that all solutions of $p^n+q^n = 0$ could be found by solving the
quadratic equations

$$p^2-2pq\cos\frac{(2\lambda-1)\pi}{n}+q^2 = 0,\quad\lambda = 0, \pm1, \pm2, \ldots.$$

Thus, (15.49) had solutions

$$1+\frac{y}{n} = \cos\frac{(2\lambda-1)\pi}{n}\pm\sqrt{-1}\,\sin\frac{(2\lambda-1)\pi}{n},$$

and, since n was an infinite number,

$$y = \pm(2\lambda-1)\pi\sqrt{-1},\quad\lambda = 0, \pm1, \pm2, \ldots.$$


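As a third problem, Euler proposed to find all values for the logarithm of a complex
number $a+b\sqrt{-1}$, beginning by noting that

$$a+b\sqrt{-1} = c\left(\cos\phi+\sqrt{-1}\,\sin\phi\right),\quad\text{where}\ c = \sqrt{a^2+b^2}\ \text{and}\ \tan\phi = \frac{b}{a}.$$

He reasoned that since c was positive, one of the values of log c must be real. Call
this value C so that

$$\log\left(a+b\sqrt{-1}\right) = C+\log\left(\cos\phi+\sqrt{-1}\,\sin\phi\right).$$

Now since

$$\cos\phi+\sqrt{-1}\,\sin\phi = e^{\phi\sqrt{-1}} = \left(1+\frac{\phi\sqrt{-1}}{n}\right)^n,$$

the values of $\log\left(\cos\phi+\sqrt{-1}\,\sin\phi\right)$ would be given by the solutions of

$$p^n-q^n = 0,\quad\text{where}\ p = 1+\frac{y}{n}\ \text{and}\ q = 1+\frac{\phi\sqrt{-1}}{n}.$$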


Moreover, because 


Euler arrived at

or

$$y = (\phi+2\lambda\pi)\sqrt{-1},\quad\lambda = 0, \pm1, \pm2, \ldots.$$

With $C = \frac{1}{2}\log(a^2+b^2)$ and $\tan\phi = \frac{b}{a}$, he obtained his result:

$$\log\left(a+b\sqrt{-1}\right) = C+\phi\sqrt{-1}+2\lambda\pi\sqrt{-1},\quad\lambda = 0, \pm1, \pm2, \ldots.$$

In modern notation and terminology, if we set z = a + ib, then $|z| = (a^2+b^2)^{1/2}$
and $\arg z = \phi+2\lambda\pi$, where arg z denotes the argument of z, a multivalued function.
We now designate the value of log z for which $-\pi < \arg z \le \pi$ as the principal value
of log z:

$$\log z = \frac{1}{2}\log|z|^2+i\arg z,\quad-\pi < \arg z \le \pi.$$


15.8 Euler: Dilogarithm 
The dilogarithm function $\mathrm{Li}_2(x)$ can be defined for $-1 \le x \le 1$ by the series

$$\mathrm{Li}_2(x) = x+\frac{x^2}{2^2}+\frac{x^3}{3^2}+\cdots = \sum_{n=1}^{\infty}\frac{x^n}{n^2}. \tag{15.50}$$

This series is obtained when the series for $-\frac{1}{t}\log(1-t)$ is integrated term by term.
Thus, we have

$$\mathrm{Li}_2(x) = -\int_0^x\frac{\ln(1-t)}{t}\,dt = \int_0^x\left(1+\frac{t}{2}+\frac{t^2}{3}+\cdots\right)dt. \tag{15.51}$$

Note that the first integral in (15.51) does not require that x be restricted to the
interval $-1 \le x \le 1$. In fact, if we use the integral definition of $\mathrm{Li}_2(x)$, as given by the first
integral in (15.51), then x can be given any complex value and, because the logarithm
is many-valued, $\mathrm{Li}_2(x)$ must be a many-valued function. Now observe that since

$$\log(1-t) = \log|1-t|+i\arg(1-t),$$

the principal value of $\mathrm{Li}_2(x)$ has the condition $-\pi < \arg(1-x) \le \pi$. Observe that
for x > 1, the principal value of log(1 − x) is

$$\log(-1)+\log(x-1) = i\pi+\log(x-1).$$

Thus $\mathrm{Li}_2(x)$ contains

$$-i\pi\int_1^x\frac{dt}{t} = -i\pi\log x$$


in its imaginary part.

The dilogarithm series (15.50) arose in a 1731 paper of Euler,²⁶ the purpose of
which was to evaluate, when s = 2,

$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^s},\quad s > 1.$$

Though Euler succeeded in finding only an approximate value for $\zeta(2)$, he obtained
an interesting and useful formula for the dilogarithm:

$$\mathrm{Li}_2(x)+\mathrm{Li}_2(1-x) = \sum_{n=1}^{\infty}\frac{1}{n^2}-\ln x\,\ln(1-x). \tag{15.52}$$

Euler wished to find an approximation for the sum of the series on the right-hand side of (15.52).
Note that this series converges very slowly; one thousand terms of the series would
yield an approximation to three decimal places. Euler took $x = \frac{1}{2}$ and got

$$\sum_{n=1}^{\infty}\frac{1}{n^2} = \sum_{n=1}^{\infty}\frac{1}{2^{n-1}n^2}+(\log2)^2. \tag{15.53}$$

The series on the right-hand side in (15.53) converges much more rapidly and
Euler found its approximate value to be 1.164481, though he did not mention how
many terms of the series $\sum_{n=1}^{\infty}\frac{1}{2^{n-1}n^2}$ were required to obtain this approximation. He
approximated $(\log2)^2$ as 0.480453 and obtained

$$\sum_{n=1}^{\infty}\frac{1}{n^2}\approx1.644934.$$

Euler’s proof of (15.52) was slightly complicated, partly because he first proved
a more general result and then derived his result as a particular case. We therefore
reproduce the brief argument given by Abel:²⁷

$$\mathrm{Li}_2(x)+\mathrm{Li}_2(1-x) = -\int_0^x\frac{\log(1-t)}{t}\,dt-\int_0^{1-x}\frac{\log(1-t)}{t}\,dt$$

$$= -\int_0^1\frac{\log(1-t)}{t}\,dt-\int_0^x\frac{\log(1-t)}{t}\,dt-\int_1^{1-x}\frac{\log(1-t)}{t}\,dt$$

$$= \sum_{n=1}^{\infty}\frac{1}{n^2}-\int_0^x\left(\frac{\log(1-t)}{t}-\frac{\log t}{1-t}\right)dt$$

$$= \sum_{n=1}^{\infty}\frac{1}{n^2}-\log x\,\log(1-x),$$


26 Eu. I-14 pp. 25-41, especially section 22. E 20.
27 Abel (1965) vol. 2, pp. 189-193.
where the final step followed from the observation that the integrand in the last
integral was the derivative of log t log(1 − t). This same argument was given by
the British mathematician William Spence²⁸ in his 1809 work on what he then called
logarithmic transcendents. At that time, Spence was unaware of John Landen’s 1760
paper containing the identical reasoning. In fact, Spence learned of Landen’s work just
as the book was going into publication, so he mentioned Landen in the preface.


15.9 Spence: Two-Variable Dilogarithm Formula 


In his essay on logarithmic transcendents,²⁹ William Spence remarked that Euler and
Bernoulli had only one variable in their formulas for the dilogarithm, whereas he
himself used more unknowns. Spence worked with the function defined by

$$\overset{2}{L}(1+x) = \int_0^x\frac{dt}{t}\ln(1+t). \tag{15.54}$$

Note that he wrote ln(1 + t) as $\overset{1}{L}(1+t)$. Spence observed that when |x| < 1:

$$\frac{\ln(1+t)}{t} = 1-\frac{1}{2}t+\frac{1}{3}t^2-\frac{1}{4}t^3+\cdots$$

and so

$$\overset{2}{L}(1+x) = x-\frac{x^2}{2^2}+\frac{x^3}{3^2}-\frac{x^4}{4^2}+\cdots.$$


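He then gave a simple proof of the formula

$$\overset{2}{L}\big((1+mx)(1+nx)\big) = \overset{2}{L}(1+mx)+\overset{2}{L}(1+nx)-\overset{2}{L}\left(\frac{m+n+mnx}{m}\right)-\overset{2}{L}\left(\frac{m+n+mnx}{n}\right)$$

$$+\ln\left(\frac{m+n+mnx}{m}\right)\ln\left(\frac{n(1+mx)}{m}\right)+\ln\left(\frac{m+n+mnx}{n}\right)\ln\left(\frac{m(1+nx)}{n}\right)-\frac{1}{1\cdot2}\left(\ln\frac{m}{n}\right)^2+2\overset{2}{L}(2). \tag{15.55}$$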


He expressed the formula (15.54) as

$$\overset{2}{L}(1+x) = \int\frac{dx}{x}\ln(1+x), \tag{15.56}$$

and worked with the integral as if it were an indefinite integral where the constant
of integration was computed in the final step. With this in mind, he replaced x by
$(m+n)x+mnx^2$ to get


28 Spence (1809).
29 Spence (1809) sections 9 and 10.

$$\overset{2}{L}\big((1+mx)(1+nx)\big) = \int\left(\frac{dx}{x}+\frac{mn\,dx}{m+n+mnx}\right)\ln\big((1+mx)(1+nx)\big)$$

$$= \int\frac{dx}{x}\ln(1+mx)+\int\frac{dx}{x}\ln(1+nx)+\int\frac{mn\,dx}{m+n+mnx}\ln(1+mx)+\int\frac{mn\,dx}{m+n+mnx}\ln(1+nx). \tag{15.57}$$

By definition, the first two integrals were $\overset{2}{L}(1+mx)$ and $\overset{2}{L}(1+nx)$, respectively.
Letting z denote the sum of the last two integrals and setting $v = m+n+mnx$, he
obtained

$$z = \int\frac{dv}{v}\ln\frac{v-m}{n}+\int\frac{dv}{v}\ln\frac{v-n}{m} = \int\frac{dv}{v}\ln\left(\frac{v}{m}-1\right)+\int\frac{dv}{v}\ln\left(\frac{v}{n}-1\right)$$

$$= \ln\frac{v}{m}\,\ln\left(\frac{v}{m}-1\right)-\overset{2}{L}\left(\frac{v}{m}\right)+\ln\frac{v}{n}\,\ln\left(\frac{v}{n}-1\right)-\overset{2}{L}\left(\frac{v}{n}\right)+C,$$

where the last step involved integration by parts and the value of the constant C was
found by setting x = 0. This completed Spence’s proof.


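Abel rediscovered Spence’s formula, with a different proof; it first appeared in his
collected papers in 1839.³⁰ Abel stated his formula as

$$\mathrm{Li}_2\left(\frac{x}{1-x}\cdot\frac{y}{1-y}\right) = \mathrm{Li}_2\left(\frac{x}{1-y}\right)+\mathrm{Li}_2\left(\frac{y}{1-x}\right)-\mathrm{Li}_2(y)-\mathrm{Li}_2(x)-\ln(1-y)\ln(1-x). \tag{15.58}$$

Abel gave a simple and elegant proof of this formula: He let a denote a constant.
Then it was easy to check that

$$\mathrm{Li}_2\left(\frac{a}{1-a}\cdot\frac{y}{1-y}\right) = -\int\left(\frac{dy}{y}+\frac{dy}{1-y}\right)\ln\frac{1-a-y}{(1-a)(1-y)} \tag{15.59}$$

$$= -\int\frac{dy}{y}\ln\left(1-\frac{y}{1-a}\right)+\int\frac{dy}{y}\ln(1-y)-\int\frac{dy}{1-y}\ln\left(1-\frac{a}{1-y}\right)-\ln(1-a)\ln(1-y), \tag{15.60}$$


30 Abel (1965) vol. 2, pp. 189-193.
and (15.59) could be verified by taking the derivative of both sides. To evaluate the
last integral, in (15.60), Abel set

$$z = \frac{a}{1-y}\quad\text{or}\quad1-y = \frac{a}{z}\quad\text{and}\quad dy = \frac{a\,dz}{z^2}$$

so that

$$-\int\frac{dy}{1-y}\ln\left(1-\frac{a}{1-y}\right) = -\int\frac{dz}{z}\ln(1-z) = \mathrm{Li}_2(z)+C = \mathrm{Li}_2\left(\frac{a}{1-y}\right)+C.$$

Thus, he had

$$\mathrm{Li}_2\left(\frac{a}{1-a}\cdot\frac{y}{1-y}\right) = \mathrm{Li}_2\left(\frac{y}{1-a}\right)+\mathrm{Li}_2\left(\frac{a}{1-y}\right)-\mathrm{Li}_2(y)-\ln(1-a)\ln(1-y)+C.$$

To find C, Abel let y = 0 to get $C = -\mathrm{Li}_2(a)$. After a was replaced by the variable
x, this proved the formula.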
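Abel's two-variable formula can be spot-checked with the power series for the dilogarithm, provided every argument lies strictly inside the unit interval. The sketch below assumes the formula in the form $\mathrm{Li}_2\left(\frac{x}{1-x}\cdot\frac{y}{1-y}\right) = \mathrm{Li}_2\left(\frac{x}{1-y}\right)+\mathrm{Li}_2\left(\frac{y}{1-x}\right)-\mathrm{Li}_2(x)-\mathrm{Li}_2(y)-\ln(1-x)\ln(1-y)$; the names `li2` and `abel_residual` and the truncation at 200 terms are illustrative choices.

```python
import math


def li2(z, terms=200):
    """Truncated series sum_{n=1}^{terms} z^n / n^2; adequate for |z| well below 1."""
    return sum(z ** n / (n * n) for n in range(1, terms + 1))


def abel_residual(x, y):
    """Left side minus right side of Abel's formula; should be near zero."""
    lhs = li2(x / (1.0 - x) * y / (1.0 - y))
    rhs = (li2(x / (1.0 - y)) + li2(y / (1.0 - x))
           - li2(x) - li2(y)
           - math.log(1.0 - x) * math.log(1.0 - y))
    return lhs - rhs


if __name__ == "__main__":
    for x, y in ((0.2, 0.3), (-0.3, 0.2), (0.1, -0.4)):
        print(x, y, abel_residual(x, y))
```

The sample points are chosen so that all five arguments have absolute value below 0.5, which keeps the truncated series accurate to machine precision.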


15.10 Schellbach: Products to Series 


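In an 1854 paper³¹ published in Crelle’s Journal, Karl Schellbach gave a method for
converting the infinite products for sine and cosine into series of partial fractions. His
method was based on the formal identity

$$\frac{\alpha}{\alpha-a}\cdot\frac{\beta}{\beta-b}\cdot\frac{\gamma}{\gamma-c}\cdot\frac{\delta}{\delta-d}\cdots = 1+\frac{a}{\alpha-a}+\frac{\alpha b}{(\alpha-a)(\beta-b)}+\frac{\alpha\beta c}{(\alpha-a)(\beta-b)(\gamma-c)}+\cdots. \tag{15.61}$$

To prove (15.61), Schellbach first observed that

$$\frac{1}{1-a} = 1+\frac{a}{1-a},\qquad\frac{1}{(1-a)(1-b)} = 1+\frac{a}{1-a}+\frac{b}{(1-a)(1-b)},$$

$$\frac{1}{(1-a)(1-b)(1-c)} = 1+\frac{a}{1-a}+\frac{b}{(1-a)(1-b)}+\frac{c}{(1-a)(1-b)(1-c)},$$

and so on; to obtain (15.61), replace a by $\frac{a}{\alpha}$, b by $\frac{b}{\beta}$, etc. Schellbach went on to note
that the infinite product for sin x in (15.20) yielded

$$\sin\frac{\pi x}{2} = \frac{\pi x}{2}\prod_{n=1}^{\infty}\left(1-\frac{x^2}{4n^2}\right), \tag{15.62}$$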


31 Schellbach (1854).
and taking x = 1 produced Wallis’s formula

$$\frac{2}{\pi} = \prod_{n=1}^{\infty}\left(1-\frac{1}{4n^2}\right). \tag{15.63}$$

He next multiplied the reciprocal of (15.62) by (15.63) to arrive at

$$x\csc\frac{\pi x}{2} = \frac{3}{2^2-x^2}\cdot\frac{15}{4^2-x^2}\cdot\frac{35}{6^2-x^2}\cdots. \tag{15.64}$$

Then, applying the partial fractions formula (15.61) to the infinite product on the
right-hand side of (15.64), we obtain

$$x\csc\frac{\pi x}{2} = 1+\frac{x^2-1^2}{2^2-x^2}+\frac{(2^2-1^2)(x^2-1^2)}{(2^2-x^2)(4^2-x^2)}+\frac{(2^2-1^2)(4^2-1^2)(x^2-1^2)}{(2^2-x^2)(4^2-x^2)(6^2-x^2)}+\cdots$$

$$= 1+\frac{x^2-1}{2^2-x^2}\left(1+\sum_{n=1}^{\infty}\frac{(2^2-1^2)(4^2-1^2)\cdots((2n)^2-1^2)}{(4^2-x^2)(6^2-x^2)\cdots((2n+2)^2-x^2)}\right). \tag{15.65}$$

Schellbach let $x\to0$ in (15.65) to derive the interesting formula

$$\frac{2}{\pi} = 1-\left(\frac{1}{2}\right)^2-\frac{1}{3}\left(\frac{1\cdot3}{2\cdot4}\right)^2-\frac{1}{5}\left(\frac{1\cdot3\cdot5}{2\cdot4\cdot6}\right)^2-\frac{1}{7}\left(\frac{1\cdot3\cdot5\cdot7}{2\cdot4\cdot6\cdot8}\right)^2-\cdots. \tag{15.66}$$

In fact, (15.66) is a particular case of Gauss’s formula (17.65) for the sum of a
hypergeometric series, but Schellbach’s derivation was elementary.
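As a plausibility check on Schellbach's series for $2/\pi$, the sketch below sums it numerically under the reading $\frac{2}{\pi} = 1-\sum_{n\ge1}\frac{1}{2n-1}\left(\frac{1\cdot3\cdots(2n-1)}{2\cdot4\cdots(2n)}\right)^2$. This general term is inferred from the first few terms and is an assumption of the sketch, as is the function name.

```python
import math


def schellbach_two_over_pi(terms):
    """Partial sum 1 - sum_{n=1}^{terms} (1/(2n-1)) * ((2n-1)!!/(2n)!!)^2."""
    total = 1.0
    ratio = 1.0   # running value of (1*3*...*(2n-1)) / (2*4*...*(2n))
    for n in range(1, terms + 1):
        ratio *= (2 * n - 1) / (2 * n)
        total -= ratio * ratio / (2 * n - 1)
    return total


if __name__ == "__main__":
    for terms in (10, 1000, 200000):
        print(terms, schellbach_two_over_pi(terms), 2.0 / math.pi)
```

The terms decay like $1/(2\pi n^2)$, so the partial sums approach $2/\pi$ from above with an error of roughly $1/(2\pi N)$ after N terms.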
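In a manner similar to that used to derive (15.65), Schellbach obtained

$$\sec\frac{\pi x}{2} = 1+\frac{x^2}{1-x^2}+\sum_{n=1}^{\infty}\frac{1^2\cdot3^2\cdots(2n-1)^2\,x^2}{(1^2-x^2)(3^2-x^2)\cdots((2n+1)^2-x^2)}. \tag{15.67}$$

Subtracting 1 from both sides of equation (15.67), dividing by $x^2$, and letting $x\to0$, he obtained the famous formula
of Euler:

$$\frac{\pi^2}{8} = 1+\frac{1}{3^2}+\frac{1}{5^2}+\frac{1}{7^2}+\cdots, \tag{15.68}$$

a formula for which we give several alternative proofs in Chapter 16.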

Schellbach discussed another method for converting products into series. This
method is due to Euler, though Schellbach does not mention this fact; it is not clear
whether or not he was aware of Euler’s work. Euler applied this method to convert
the product $\prod_{n=1}^{\infty}(1-x^n)$ into a series. In Chapter 25, we discuss this method
and example; the approach depends, as Euler himself explained, on the algebraic
identity

$$(1-\alpha)(1-\beta)(1-\gamma)(1-\delta)\cdots = 1-\alpha-\beta(1-\alpha)-\gamma(1-\alpha)(1-\beta)-\delta(1-\alpha)(1-\beta)(1-\gamma)-\cdots. \tag{15.69}$$

Schellbach observed that since

$$\cos\frac{\pi x}{2} = \left(1-x^2\right)\left(1-\frac{x^2}{3^2}\right)\left(1-\frac{x^2}{5^2}\right)\cdots,$$

it followed from (15.69) that

$$\cos\frac{\pi x}{2} = 1-x^2-\frac{x^2}{3^2}\left(1-x^2\right)-\frac{x^2}{5^2}\left(1-x^2\right)\left(1-\frac{x^2}{3^2}\right)-\cdots. \tag{15.70}$$

Similarly, since by (15.64)

$$\sin\frac{\pi x}{2} = x\cdot\frac{2^2-x^2}{3}\cdot\frac{4^2-x^2}{15}\cdot\frac{6^2-x^2}{35}\cdots,$$

he had

$$\sin\frac{\pi x}{2} = x+\frac{x(1-x^2)}{1\cdot3}+\frac{x(1-x^2)(2^2-x^2)}{(1\cdot3)^2\cdot5}+\frac{x(1-x^2)(2^2-x^2)(4^2-x^2)}{(1\cdot3\cdot5)^2\cdot7}$$

$$+\frac{x(1-x^2)(2^2-x^2)(4^2-x^2)(6^2-x^2)}{(1\cdot3\cdot5\cdot7)^2\cdot9}+\cdots. \tag{15.71}$$

In the last section of his paper, Schellbach gave yet another method for obtaining
series for trigonometric functions, a tentative method that he nevertheless endorsed
on the grounds that it gave good approximations for specific values of trigonometric
functions. However, he found two very nice formulas by this method, involving $\cos\frac{\pi x}{3}$
and $\sin\frac{\pi x}{3}$.

Regard the function $\cos\frac{\pi x}{3}$. At the points $x = 0, \pm1, \pm2, \pm3, \pm4, \pm5, \ldots$, it takes
the values $1, \frac{1}{2}, -\frac{1}{2}, -1, -\frac{1}{2}, \frac{1}{2}, \ldots$, respectively; thus, the values repeat themselves.
So Schellbach considered the series

$$\cos\frac{\pi x}{3} = A_1+A_2x^2+A_3x^2(x^2-1^2)+A_4x^2(x^2-1^2)(x^2-2^2)+A_5x^2(x^2-1^2)(x^2-2^2)(x^2-3^2)+\cdots$$

and set x = 0, 1, 2, 3, 4, ... successively to find that

$$A_1 = 1,\quad A_2 = -\frac{1}{2!},\quad A_3 = \frac{1}{4!},\quad A_4 = -\frac{1}{6!},\quad A_5 = \frac{1}{8!},\ \ldots.$$

He accordingly obtained the formula

$$\cos\frac{\pi x}{3} = 1-\frac{x^2}{2!}+\frac{x^2(x^2-1^2)}{4!}-\frac{x^2(x^2-1^2)(x^2-2^2)}{6!}+\frac{x^2(x^2-1^2)(x^2-2^2)(x^2-3^2)}{8!}-\cdots, \tag{15.72}$$

and following a similar argument he had

$$\sin\frac{\pi x}{3} = \frac{\sqrt{3}}{2}\left(x-\frac{x(x^2-1^2)}{3!}+\frac{x(x^2-1^2)(x^2-2^2)}{5!}-\frac{x(x^2-1^2)(x^2-2^2)(x^2-3^2)}{7!}+\cdots\right). \tag{15.73}$$

Now to prove (15.72) by Schellbach’s method amounts to showing that, when n is
an integer,

$$\cos\frac{n\pi}{3} = 1-\frac{n^2}{2!}+\frac{n^2(n^2-1^2)}{4!}-\cdots+(-1)^n\frac{n^2(n^2-1^2)\cdots(n^2-(n-1)^2)}{(2n)!}; \tag{15.74}$$

similarly, proving (15.73) requires us to prove that, when n is an integer,

$$\frac{2}{\sqrt{3}}\sin\frac{n\pi}{3} = n-\frac{n(n^2-1^2)}{3!}+\frac{n(n^2-1^2)(n^2-2^2)}{5!}-\cdots+(-1)^{n-1}\frac{n(n^2-1^2)\cdots(n^2-(n-1)^2)}{(2n-1)!}. \tag{15.75}$$

Formulas (15.74) and (15.75) can, of course, be verified, though Schellbach did not
give the proofs in his paper.
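For small non-negative integers n, both finite sums can be verified in exact rational arithmetic. The sketch below assumes the sums read $\sum_{j=0}^{n}(-1)^j\frac{n^2(n^2-1^2)\cdots(n^2-(j-1)^2)}{(2j)!}$ for $\cos\frac{n\pi}{3}$ and $\sum_{j=0}^{n-1}(-1)^j\frac{n(n^2-1^2)\cdots(n^2-j^2)}{(2j+1)!}$ for $\frac{2}{\sqrt{3}}\sin\frac{n\pi}{3}$, and compares them with the period-six tables of exact values. The function and table names are illustrative.

```python
from fractions import Fraction
from math import factorial

# Exact values of cos(n*pi/3) and (2/sqrt(3))*sin(n*pi/3) for n = 0, 1, ..., 5 (period 6).
COS_VALUES = [Fraction(1), Fraction(1, 2), Fraction(-1, 2),
              Fraction(-1), Fraction(-1, 2), Fraction(1, 2)]
SIN_VALUES = [Fraction(0), Fraction(1), Fraction(1),
              Fraction(0), Fraction(-1), Fraction(-1)]


def cos_side(n):
    """1 - n^2/2! + n^2(n^2-1^2)/4! - ... up to the term with (2n)! in the denominator."""
    total = Fraction(1)
    numerator = 1
    for j in range(1, n + 1):
        numerator *= n * n - (j - 1) ** 2
        total += Fraction((-1) ** j * numerator, factorial(2 * j))
    return total


def sin_side(n):
    """n - n(n^2-1^2)/3! + ... up to the term with (2n-1)! in the denominator."""
    total = Fraction(0)
    numerator = n
    for j in range(0, n):
        if j > 0:
            numerator *= n * n - j * j
        total += Fraction((-1) ** j * numerator, factorial(2 * j + 1))
    return total


if __name__ == "__main__":
    for n in range(0, 13):
        print(n, cos_side(n) == COS_VALUES[n % 6], sin_side(n) == SIN_VALUES[n % 6])
```

Exact fractions avoid any doubt about rounding; a check over finitely many n is of course evidence rather than a proof.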

Note that Schellbach’s formula (15.72) is a particular case of (8.17), a formula most
likely known to Newton. To see this, replace n by 2x and $\theta$ by $\frac{\pi}{6}$. Also observe that
(15.73) may be obtained by taking the derivative of (8.17) with respect to $\theta$ and then
replacing n by 2x and $\theta$ by $\frac{\pi}{6}$. Note also that if we take $\theta = \frac{\pi}{2}$ in (8.16) and (8.17),
we arrive at the series

$$\sin\frac{\pi x}{2} = x-\frac{x(x^2-1^2)}{3!}+\frac{x(x^2-1^2)(x^2-3^2)}{5!}-\frac{x(x^2-1^2)(x^2-3^2)(x^2-5^2)}{7!}+\cdots \tag{15.76}$$

and

$$\cos\frac{\pi x}{2} = 1-\frac{x^2}{2!}+\frac{x^2(x^2-2^2)}{4!}-\frac{x^2(x^2-2^2)(x^2-4^2)}{6!}+\cdots. \tag{15.77}$$

Glaisher employed formulas (15.72), (15.73), (15.76), and (15.77) to obtain nice
formulas for $\pi^3$ and $\pi^4$:³²


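First, by equating coefficients of $x^3$ in (15.76), he had

$$\frac{\pi^3}{48} = \sum_{n=1}^{\infty}\frac{1\cdot3\cdots(2n-1)}{2\cdot4\cdots2n}\cdot\frac{1}{2n+1}\left(1+\frac{1}{3^2}+\cdots+\frac{1}{(2n-1)^2}\right);$$

equating coefficients of $x^3$ in (15.73) gave him

$$\frac{\pi^3}{81\sqrt{3}} = \sum_{n=1}^{\infty}\frac{(n!)^2}{(2n+1)!}\left(1+\frac{1}{2^2}+\cdots+\frac{1}{n^2}\right);$$

next, by equating coefficients of $x^4$ in (15.77), he got

$$\frac{\pi^4}{48} = \sum_{n=1}^{\infty}\frac{2\cdot4\cdots2n}{3\cdot5\cdots(2n+1)}\cdot\frac{1}{n+1}\left(1+\frac{1}{2^2}+\cdots+\frac{1}{n^2}\right);$$

finally, he equated coefficients of $x^4$ in (15.72) to obtain

$$\frac{\pi^4}{1944} = \sum_{n=1}^{\infty}\frac{(n!)^2}{(2n+2)!}\left(1+\frac{1}{2^2}+\cdots+\frac{1}{n^2}\right).$$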


15.11 Exercises 


(1) Show that the series for cos x can be obtained by a repeated integration of the
equation $\cos x = 1-\int_0^x\int_0^t\cos u\,du\,dt$. This method of deriving the series
for cosine is due to Leibniz. See Newton (1959-1960) vol. 2, p. 74. See also
Chapter 1, where Madhava’s similar, but earlier, method is discussed.

(2) Let n = 2m + 1. Show that 


=) . ( =) . ( =) 

sin| x + sin| x 

n n n 

: ( =) ; ( =) . ( =) 

+ Sin| x sin| x sin| x 4 Ped Ges 
n n n 
mit mi 

ssin( + “) ssin( — “) = 0, 
n n 


where the plus sign is used when m is even and the minus sign otherwise. See 
Euler (1988) p. 208. 


(3) With n as in the previous problem, prove Euler’s formula 


sin x — sin (x 


32 Glaisher (1878) p. 76.
oA aa 20
ncscnx = csc x —csc| x 4 csc| x + csc{ x 4 
n n n 
20 mit mir 
t+ csc| x -+»tescl x + — ] £csc{ x — — ]. 
n n n 


See Euler (1988) p. 209. 
(4) Prove the following formulas: 


cosmnx = 2"! cos(x-+ "2 cos( esas ) 
= ——= x ———1 
n n 


( n—3 ) ( n—3 ) 
-cos| x + ——Z ] cos| x — ——z]--:-, 
n n 


where there are n factors; 


IN 20 
ncotnx =cotx 4 cot (x t ) + cot (x+=) 
n n 
(-+"*) 
+.+-+-+cot (x + —mT ]}. 
n 


Also show that the sum of the squares of the cotangents is —n. See 
Euler (1988) pp. 214 and 218. 


(5) Set Clausen’s function as Clo(@) = eras, Se First show that 


(sin x)2 


6 
t 
Ch(@) == -| log |2 sin al at. 
0 


Then show that Landen’s formula (16.91) can be correctly and comprehen- 
sively stated by Kummer’s 1840 result: 


Lin(re!”) = Lin(r,0) + sw log r + Clo(2w) + Clo(20) — Clo(2w + 26)], 


r sind 


where tan @ = 7—-<5,q> and 
1 f” log(l —2rcos@ +r? 
Lin(r,0) = / ee 
PI 0 r 


See Kummer (1840) pp. 74-90 and Lewin (1981) pp. 120-121. 
(6) Prove Kummer’s formula 


yy — ie = 
(= 2) — Ln(* “*) } Lin( y ) } Lin(“ =) 
yd —x) : | xy—y y—xy 
of ey L 2 
ETE + —(1 . 
n( =) 5 (in y) 


See Clausen (1832) and Kummer (1975) vol. 2, p. 238.
(7) Prove that Spence’s formula (15.55) and Abel’s formula (15.58) are
equivalent.
(8) Show that

$$\frac{\sin x}{\sin y} = \frac{x}{y}\left(\frac{\pi-x}{\pi-y}\right)\left(\frac{\pi+x}{\pi+y}\right)\left(\frac{2\pi-x}{2\pi-y}\right)\left(\frac{2\pi+x}{2\pi+y}\right)\left(\frac{3\pi-x}{3\pi-y}\right)\left(\frac{3\pi+x}{3\pi+y}\right)\cdots.$$

Derive the product for cos x by replacing x by $\frac{\pi}{2}-x$, y by $\frac{\pi}{2}$, and applying
Wallis’s formula. See Cauchy’s Analyse algébrique, note IV.
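(9) Prove that

$$\mathrm{Li}_2(x)+\mathrm{Li}_2(y)-\mathrm{Li}_2(xy) = \mathrm{Li}_2\left(\frac{x-xy}{1-xy}\right)+\mathrm{Li}_2\left(\frac{y-xy}{1-xy}\right)+\ln\left(\frac{1-x}{1-xy}\right)\ln\left(\frac{1-y}{1-xy}\right).$$

See L. J. Rogers (1907).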


(10) Prove the two inequalities for $0 < x < \frac{\pi}{2}$ used in Cauchy’s proof of (15.35):

$$\sin x < x\quad\text{and}\quad\frac{x}{\sin x} < 2.$$

See note IX of Cauchy’s Analyse algébrique.
(11) Prove that for even m

$$\cos mx = \prod_{k=1}^{m/2}\left(1-\frac{\sin^2x}{\sin^2\frac{(2k-1)\pi}{2m}}\right),$$

$$\sin mx = m\sin x\cos x\prod_{k=1}^{\frac{m}{2}-1}\left(1-\frac{\sin^2x}{\sin^2\frac{2k\pi}{2m}}\right).$$

State and prove a similar formula for odd m. See Cauchy, Analyse algébrique,
note IX.
(12) Use the formulas in Exercise 11 and (15.35) to show that for m even 


e 
m 2k —1 
cosmx = 227! I] (cos 2x — cos a) , 
k=l ue 


m 


7 
: m_ q 
sinmx = 227! sin2x I] (cos 2x — COS 
k=2 


(2k — 1)x 
am). 


State and prove similar results for m odd. See Cauchy, Analyse algébrique,
note IX.
(13) Suppose ¢o(x) = ¢0(+) and


n(x) 


Prove that 


(a) don(x) — b2an() = 2 7429 Gon—2k—-1 (1) nx) 241, 
1X) = Gon41(4) = 2 Whig b2n—2K-41 (1) An x)*, 


(b) $2n4 


(c) o2n4 


d 
= | S610, n= 1,2,3,.... 
x 


+11) — b2n4 


1) = DCD eb on—2e1 A). 


Find ¢; (1), when ¢o(x) = oe tet Se Spence (1819) pp. 139-143. 


15 


x™4Latx—m* 


.12 Notes on the Literature 


Spence (1819), edited by John Herschel, contains Spence’s 1809 book on logarithmic 
transcendents. Lewin (1981) gives a modern treatment of the dilogarithm and poly- 
logarithm as functions of complex variables. 
16.1 Preliminary Remarks


One of the outstanding and most difficult mathematical problems of the early
eighteenth century was the summation of the series $\sum_{n=1}^{\infty}\frac{1}{n^2}$. In his 1650 book
Novae Quadraturae Arithmeticae Seu De Additione Fractionum, Pietro Mengoli
(1626-1686) considered the sum of the reciprocals of figurate numbers: natural
numbers, triangular numbers, square numbers, and so on.¹ For the natural numbers,
Mengoli showed that the sum of their reciprocals diverged, or that the harmonic 
series was divergent. For triangular numbers, he showed how the reciprocal of each 
triangular number could be written as a difference of two fractions, thus summing the 
series. In the next step, square numbers, Mengoli posed the problem of summing their 
reciprocals, but could not solve it. He expressed surprise that the series of triangular 
reciprocals could be summed more easily than the series of square reciprocals, saying 
that a “richer intellect” would be required to solve this problem. 

Leibniz, Jakob and Johann Bernoulli, and James Stirling all later attempted to sum
this series.² In fact, this question became known as the Basel problem because it
frustrated the very best efforts of Jakob Bernoulli of Basel, who wrote that he would be
greatly indebted to anyone who would send him a solution. Unfortunately, the solution
was not found until thirty years after Bernoulli’s death in 1705; in a famous paper of
1735, Euler became the first to sum this series.³ With characteristic brilliance, Euler
made use of the formulas relating the roots to the coefficients of algebraic equations 
and he boldly applied them to equations of infinite degree. When Euler communicated 
his results without proof to Stirling in 1736, the latter wrote in response that Euler 


1 Mengoli (1650).

2 Leibniz (1971) vol. 2, part 1, pp. 118-122. Bernoulli (1744) vol. 1, pp. 375-402, 517-542. Bernoulli (1742)
vol. 4, pp. 19-25. Stirling (1730) pp. 28-29 or for an English translation of this letter, see Stirling and
Tweddle (2003) p. 46. Johann Bernoulli (1742) presented this material in 1742 in a somewhat misleading
manner, without mention of Euler, from whom he learned the key ideas. Indeed, for this reason, Spence had
the impression that Johann Bernoulli was the original discoverer of the formula for the sums of even powers of
the reciprocals of integers.

3 Eu. I-14 pp. 73-86. E 41.

4 For an English translation, see Tweddle (1988) pp. 141-144, especially p. 143.
must have tapped a new source, since the old methods were insufficient. Euler had
indeed found a new source, beginning with the equation

\[ 0 = 1 - \frac{1}{y}\sin x = 1 - \frac{x}{y} + \frac{x^3}{3!\,y} - \frac{x^5}{5!\,y} + \cdots, \tag{16.1} \]

where $y$ was a constant between $-1$ and $1$. He argued that $x = 2n\pi + A$ and
$x = (2n+1)\pi - A$ gave a complete list of the roots of (16.1), provided that $\sin A = y$
and that $n$ assumed all possible integer values. Hence, he factored the right-hand side
of (16.1) as

\[ \left(1 - \frac{x}{A_1}\right)\left(1 - \frac{x}{A_2}\right)\left(1 - \frac{x}{A_3}\right)\cdots, \tag{16.2} \]

where $A_1, A_2, A_3, \ldots$ comprised all the roots. He equated the coefficients of $x$ to
obtain the sum of the reciprocals of the roots:

\[ \frac{1}{A_1} + \frac{1}{A_2} + \frac{1}{A_3} + \cdots = \frac{1}{y}. \tag{16.3} \]


In a similar manner, Euler argued, the sum of all the products of the reciprocals of
the roots, taken two at a time, was equal to the coefficient of $x^2$ and was thus zero; the
sum of the products of the reciprocals taken three at a time was equal to the negative
of the coefficient of $x^3$ and was thus $-\frac{1}{6y}$, and so on.

To obtain the sums of the squares, cubes, fourth powers, etc., of the reciprocals of 
the roots, he applied the Girard—Newton formulas. For a discussion of these formulas, 
see Section 13.3, particularly the material contained between equations (13.14) and 


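(13.17). Thus, when $y = 1$ and $A = \frac{\pi}{2}$, Euler had the Madhava–Leibniz formula

\[ \frac{4}{\pi} - \frac{4}{3\pi} + \frac{4}{5\pi} - \cdots = 1, \tag{16.4} \]

or

\[ 1 - \frac13 + \frac15 - \cdots = \frac{\pi}{4}. \tag{16.5} \]

Note here that the roots of order 2 of the equation $\sin x = 1$ are $x = \frac{\pi}{2}, -\frac{3\pi}{2}, \frac{5\pi}{2}, \ldots$.
Thus, each root occurs twice: $A_1 = \frac{\pi}{2}$, $A_2 = \frac{\pi}{2}$, $A_3 = -\frac{3\pi}{2}$, $A_4 = -\frac{3\pi}{2}$, and
so on. This explains why there is a factor 4 in the numerators of the series in (16.4).
From the Girard–Newton formulas, Euler obtained

\[ 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots = \frac{\pi^2}{8}, \tag{16.6} \]
\[ 1 - \frac{1}{3^3} + \frac{1}{5^3} - \cdots = \frac{\pi^3}{32}, \tag{16.7} \]
\[ 1 + \frac{1}{3^4} + \frac{1}{5^4} + \cdots = \frac{\pi^4}{96}. \tag{16.8} \]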
He then observed, as Jakob Bernoulli had also seen,⁵ that

\[ 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots + \frac14\left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots\right) \]

and hence by (16.6)

\[ \sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6}. \tag{16.9} \]

Similarly, Euler found

\[ \sum_{n=1}^{\infty}\frac{1}{n^4} = \frac{\pi^4}{90}, \tag{16.10} \]


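and so on. To derive other series, he took $y$ as constants other than one. For example,
for $y = \frac{1}{\sqrt2}$ and $y = \frac{\sqrt3}{2}$ he had, respectively,

\[ 1 + \frac13 - \frac15 - \frac17 + \frac19 + \frac{1}{11} - \cdots = \frac{\pi}{2\sqrt2} \tag{16.11} \]

and

\[ 1 + \frac12 - \frac14 - \frac15 + \frac17 + \frac18 - \frac{1}{10} - \cdots = \frac{2\pi}{3\sqrt3}. \tag{16.12} \]

Observe that the roots of the equation $\sin x = \frac{1}{\sqrt2}$ are $\frac{\pi}{4}, \frac{3\pi}{4}, -\frac{5\pi}{4}, -\frac{7\pi}{4}, \ldots$.
Similarly, the roots $\frac{\pi}{3}, \frac{2\pi}{3}, -\frac{4\pi}{3}, -\frac{5\pi}{3}, \ldots$ of the equation $\sin x = \frac{\sqrt3}{2}$ produce (16.12).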

Recall that Newton had earlier proved (16.11) by the integration of the rational
function $\frac{1+x^2}{1+x^4}$ over [0, 1]. See Section 12.1 and equation (12.3). Indeed, Euler credited
Newton with this result. Of course, Madhava and Leibniz found (16.5) by integrating
the rational function $\frac{1}{1+x^2}$. In the same manner, (16.12) can be obtained by the
integration of $\frac{1+x}{1+x^3}$. Thus, integration of rational functions is a powerful method
for evaluating many series that sum to multiples of $\pi$ or to logarithms of numbers.
However, this method is not as effective for series such as (16.6) through (16.10).
Here Euler provided a new insight, so that one could efficiently sum up many series.
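As a numerical sanity check on the integration method, the following Python sketch approximates the integrals over [0, 1] of $\frac{1}{1+x^2}$, $\frac{1+x^2}{1+x^4}$ and $\frac{1+x}{1+x^3}$ and compares them with the closed forms $\frac{\pi}{4}$, $\frac{\pi}{2\sqrt2}$ and $\frac{2\pi}{3\sqrt3}$ of the series (16.5), (16.11) and (16.12). The helper name `simpson` and the choice of 1000 subintervals are illustrative assumptions; nothing here reflects how Newton, Leibniz or Euler actually computed.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Each integrand is paired with the closed form of the series it evaluates.
cases = {
    "(16.5)":  (lambda x: 1 / (1 + x**2),          math.pi / 4),
    "(16.11)": (lambda x: (1 + x**2) / (1 + x**4), math.pi / (2 * math.sqrt(2))),
    "(16.12)": (lambda x: (1 + x) / (1 + x**3),    2 * math.pi / (3 * math.sqrt(3))),
}

results = {label: simpson(f, 0.0, 1.0) for label, (f, _) in cases.items()}

for label, (_, exact) in cases.items():
    print(label, results[label], exact)
```

Expanding each integrand as a geometric series and integrating term by term reproduces the corresponding series, which is why the two columns printed for each label should agree.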
When Euler communicated his method to some of his mathematical correspondents,
there were objections to his procedure. How did he know, for example, that
$n\pi$ were the only roots of $\sin x = 0$? It was possible that some roots were complex!
How could he employ the Girard—Newton formulas, applicable to polynomials, for 
equations of infinite degree? In addition, there were also convergence questions 
concerning some of Euler’s series. But a year after writing his famous paper summing 


5 Bernoulli (1744) vol. 1, pp. 529-533. Stirling also discovered this independently. See his reply to Euler in 
Tweddle (1988) pp. 144-145. 
the series $\sum_{n=1}^{\infty}\frac{1}{n^2}$, Euler proved his product formula for the sine function,⁶ justifying
his contention that with $n$ an integer, $n\pi$ gave all the roots of the sine function, of
which there were thus no complex roots.

Euler was well aware of these inevitable objections, but he believed in the 
correctness of his formulas. His methods had succeeded in rederiving known formulas 
and, moreover, numerical methods such as the Euler—Maclaurin formula showed him 
that his results were correct to many decimal places. So Euler made great efforts to 
resolve any doubts about his method, as well as to prove his formulas using alternative 
procedures. For example, by proving the product formulas for sin x and cos x, (15.20) 
and (15.21), he showed that these functions had only the well-known zeros and no 
others. And in 1737 Euler gave an ingenious method for deriving (16.6) and (16.9)
by computing $\int_0^1 \frac{\arcsin x}{\sqrt{1-x^2}}\,dx$ in two different ways. Unfortunately, this method did
not extend to formulas such as (16.7), (16.8), and (16.10) where the powers of the
denominators were greater than two. Meanwhile, in 1738, Niklaus I Bernoulli gave an
extremely clever proof of (16.6) by squaring the series (16.5).⁷ Euler soon simplified
the argument and in a letter of July 30, 1738, communicated the simplification to
Johann Bernoulli.⁸ A few years later, Euler generalized the essential idea of the
simplification and communicated it to his friend Goldbach as a theorem.⁹ In his next
letter of October 27, 1742, Euler explained with details how he had used that theorem
to obtain (16.6) from (16.5). Goldbach responded to the theorem in two letters of
February 1743, suggesting the problem of the summation of some new series, leading
Euler to the study of series now known as double-zeta values.¹⁰

It also turned out that series methods could be used to evaluate $\sum_{n=1}^{\infty}\frac{1}{n^{2k}}$ for
$k \geq 1$. In particular, one could call upon the series of the polylogarithmic function
defined by $\sum_{n=1}^{\infty}\frac{x^n}{n^k}$ for complex $x$ with $|x| < 1$. John Landen was the first to
use a polylogarithmic function with complex values to sum this series, though his
work shows that he did not fully grasp the difficulties connected with multivalued
functions. Euler had a clearer idea of how to deal with such difficulties, though he did
not work out the details of this method for the summation of $\sum_{n=1}^{\infty}\frac{1}{n^{2k}}$, a derivation
first accomplished by William Spence.¹¹

In a paper of 1743 published in Berlin, Euler used the partial fractions expansions of
$\cot \pi x$ and $\csc \pi x$ and their derivatives to find the sum of $\sum \frac{1}{n^{2k}}$ and related series.¹²
This is essentially the method often used in modern textbooks, although Euler, Daniel
Bernoulli, and Landen found other significant proofs. In 1752, Euler discovered the
“Fourier” expansion¹³


6 Eu. I-14 pp. 3-6. E 61 § 7-10.

7 Bernoulli (1738).

8 Eu. IV A-2 pp. 230-247, especially p. 243.

9 Fuss (1968) vol. I, pp. 144-153, especially p. 153.

10 ibid. pp. 193-208.

11 Spence (1809) § 31.

12 Eu. I-17 pp. 1-34. E 59.

13 Eu. I-14 pp. 542-584. E 246.
\[ \sin\phi - \frac12\sin 2\phi + \frac13\sin 3\phi - \frac14\sin 4\phi + \frac15\sin 5\phi - \text{etc.} = \frac{\phi}{2}, \tag{16.13} \]

yielding (16.5) when $\phi = \frac{\pi}{2}$. Integration of (16.13) gave him

\[ \cos\phi - \frac{1}{2^2}\cos 2\phi + \frac{1}{3^2}\cos 3\phi - \frac{1}{4^2}\cos 4\phi + \text{etc.} = C - \frac{\phi^2}{4}. \tag{16.14} \]


As we discuss in Section 20.6, in 1773 Daniel Bernoulli showed that the values of
> Gy? and )> OED IT =; could be obtained from (16.14). '4 Euler further improved 
on these results, and their joint work produced the Fourier expansions for Bernoulli 
polynomials, related to the polylogarithmic function 


ein? 


oe) 
yee. 
n=l 


: —1)" 
Note that in 1758 Landen evaluated the sums }°> or and >> ET by means of 


the polylogarithmic functions.¹⁵ To do this, he employed complex numbers through
the use of log(—1). In the 1770s, Euler made further strides in this area, though his 
work had a few errors, especially where complex numbers were used. Thus, by the 
1770s, Euler had worked out many ways of evaluating )> . In his papers written 
during that period, he described his methods and even added a new one, employing 
integration and differentiation under the integral sign. 

From the start of his work on what we would now call zeta values, Euler
observed that the numbers appearing in the values of $\sum \frac{1}{n^{2k}}$, $k = 1,2,3,\ldots$, also
presented themselves as coefficients in the Euler–Maclaurin summation formula. He
was intrigued by this puzzle and wrote in 1738 in his second letter to Stirling that an
explanation of this would be a significant advancement.¹⁶ In a 1739 paper, finally
published in 1750, Euler began to understand this, as he used differential equations
to obtain the Taylor series expansions of $\cot \pi x$ and $\frac{xe^x}{e^x-1}$. The coefficients of the
series for $\cot \pi x$ involved the sums $\sum \frac{1}{n^{2k}}$. On the other hand, the coefficients of
$\frac{xe^x}{e^x-1}$ involved the Bernoulli numbers and were the coefficients in the Euler–Maclaurin
series. Thus, Euler made his own “significant advancement.” It was around 1740 that
Euler more precisely understood the relation between the trigonometric functions and 
the exponential function, noting that 


; 2 xe* 2 
cotx =1 le ig] ’ wd rte ae | (16.15) 


gave a simpler reason for the appearance of Bernoulli numbers in the summation of
$\sum \frac{1}{n^{2k}}$. He gave details in his differential calculus book of 1755.¹⁷


14 Bernoulli (1982-1996) vol. 2, pp. 119-134, especially pp. 120-121. 
15 Landen (1758).

16 For a translation of this letter, see Tweddle (1988) pp. 145-154.

17 Eu. I-10. E 212.
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Euler was also mystified by the fact that, even though he could sum )° se or 


-yyrtl , : : : 
> ¢ s in various ways, he was unable to find the sum of the series with odd 


powers, $\sum \frac{1}{n^{2k+1}}$. Of course, this is a problem that has baffled mathematicians to the
present day. To shed some light on this question, Euler considered the divergent series
$\sum (-1)^{n+1} n^k$. In a 1739 paper,¹⁸ he noted that


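\[ 1 - 1 + 1 - 1 + 1 - \text{etc.} = \frac12, \tag{16.16} \]
\[ 1 - 2^{2k} + 3^{2k} - 4^{2k} + \text{etc.} = 0, \tag{16.17} \]
\[ 1 - 2^{2k-1} + 3^{2k-1} - 4^{2k-1} + \text{etc.} = (-1)^{k-1}\frac{2\cdot 1\cdot 2\cdots(2k-1)}{\pi^{2k}}\left(1 + \frac{1}{3^{2k}} + \frac{1}{5^{2k}} + \cdots\right), \tag{16.18} \]

$k = 1,2,3,\ldots$. Note that the series in parentheses in (16.18) can also be written as

\[ \frac{2^{2k}-1}{2^{2k}-2}\left(1 - \frac{1}{2^{2k}} + \frac{1}{3^{2k}} - \frac{1}{4^{2k}} + \text{etc.}\right). \tag{16.19} \]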


Thus, the series with even/odd powers in the denominator were related to the series 
with odd/even powers in the numerator; however, the series with even powers in the 
numerator summed to zero (except for k = 0), and hence gave no information about 


the odd series $\sum \frac{1}{n^{2k+1}}$, $k = 1,2,3,\ldots$. In the exceptional case, for $k = 0$, one
gets $\sum \frac{(-1)^{n+1}}{n} = \ln 2$. We observe that Euler was well aware that the
series on the left-hand side were divergent, but he plunged right in anyway, since 
this work was yielding him insight into a very challenging problem. Indeed, he was 
justified in his audacity, since this approach led him to the functional equation for the 
zeta function. 


It appears that at some point in the 1740s, Euler started thinking of the series $\sum \frac{1}{n^{2k}}$
and other related series as particular values of the functions defined by $\zeta(s) = \sum \frac{1}{n^s}$,
etc. Here note, however, that the label $\zeta(s)$ was later given by Riemann. In a paper
presented to the Berlin Academy in 1749 but published in 1768, Euler drew a relation
between $\zeta(s)$ and $\zeta(1-s)$, writing the equation,¹⁹ with $m$ a real number:


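\[ \frac{1 - 2^{m-1} + 3^{m-1} - 4^{m-1} + 5^{m-1} - 6^{m-1} + \text{etc.}}{1 - 2^{-m} + 3^{-m} - 4^{-m} + 5^{-m} - 6^{-m} + \text{etc.}} = -\frac{1\cdot2\cdot3\cdots(m-1)\,(2^m - 1)}{(2^{m-1} - 1)\,\pi^m}\cos\frac{m\pi}{2}. \tag{16.20} \]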


In this way, Euler found a generalization of (16.18). 


18 E 130 § 30, Eu. I-14, pp. 407-462. See also E 212 § 185-187, Eu. I-10, pp. 384-386.
19 E 352, § 13. Eu. I-15 pp. 70-90, especially p. 83.
In 1826, Abel wrote his friend Holmboe that equation (16.17) was a laughable
equation to write.²⁰ Abel’s early training had been in the formal mathematical tradition
of which Euler was considered the model. After he studied Cauchy’s lectures on
analysis, Abel changed his view of mathematics; he believed it illegitimate to use
divergent series at all and therefore wished to abolish formulas such as (16.17),
(16.18), and (16.20). However, Euler had a very clear idea that the definition of the
divergent series in these formulas amounted to a limit:

\[ 1 - 2^n + 3^n - 4^n + \cdots = \lim_{x\to 1^-}\left(1 - 2^n x + 3^n x^2 - 4^n x^3 + \cdots\right). \tag{16.21} \]

He found the values of these limits, for nonnegative integer values of $n$, by repeated
multiplication by $x$ followed by differentiation of the geometric series formula

\[ 1 - x + x^2 - x^3 + \cdots = \frac{1}{1+x}. \tag{16.22} \]


Euler’s technique of summing (16.21) became an important summability method 
in the theory of divergent series developed after 1890. Ironically, it is called the 
Abel sum. 
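The procedure of multiplying the geometric series $\frac{1}{1+x} = 1 - x + x^2 - \cdots$ by $x$ and differentiating, repeated $n$ times, can be traced with exact rational arithmetic. The sketch below is an illustration under stated assumptions, not a transcription of Euler's computation: it writes the result after $k$ steps as $P_k(x)/(1+x)^{k+1}$, updates the polynomial $P_k$, and evaluates at $x = 1$. The function name `abel_sum` and the polynomial bookkeeping are hypothetical choices made for this example.

```python
from fractions import Fraction

def abel_sum(n):
    """Limit as x -> 1- of 1 - 2^n x + 3^n x^2 - 4^n x^3 + ... for integer n >= 0."""
    # After k steps the function is P_k(x) / (1 + x)^(k + 1), with P_0 = 1.
    p = [Fraction(1)]  # coefficients of P_k, lowest degree first
    for k in range(n):
        # d/dx [x P / (1+x)^(k+1)] = [(P + x P')(1 + x) - (k+1) x P] / (1+x)^(k+2)
        q = [c * (i + 1) for i, c in enumerate(p)]  # coefficients of P + x P'
        new = [Fraction(0)] * (len(p) + 1)
        for i, c in enumerate(q):
            new[i] += c          # (P + x P') * 1
            new[i + 1] += c      # (P + x P') * x
        for i, c in enumerate(p):
            new[i + 1] -= (k + 1) * c  # -(k+1) x P
        p = new
    # Evaluate P_n(1) / (1 + 1)^(n + 1).
    return sum(p) / Fraction(2) ** (n + 1)

for n in range(6):
    print(n, abel_sum(n))
```

The values for $n = 0, 1, 2, 3$ are $\frac12, \frac14, 0, -\frac18$, in agreement with (16.16), (16.17) and the case $k = 1, 2$ of (16.18).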

Euler verified (16.20) for all integer values of $m$ and for two fractional values,
$m = \frac12$ and $m = \frac32$. In general, he meant $1\cdot2\cdot3\cdots(m-1)$ to stand for the gamma
function $\Gamma(m)$. For positive integer values of $m$, he made illegitimate but successful
use of the Euler–Maclaurin summation formula to sum the divergent series

\[ x^m - (x+1)^m + (x+2)^m - (x+3)^m + \cdots. \tag{16.23} \]


By doing this, he could bring into play the Bernoulli numbers that appear in the 
Euler—Maclaurin summation, using them to evaluate the sums on the left-hand side of 
(16.21). In fact, he could have done this without using Euler—Maclaurin, had he first 
applied the change of variables x = e~” in (16.22). Euler believed that a divergent 
series, especially an alternating series, had a definite value, obtainable by varying 
methods. 

In order to verify (16.20) for $m = 0$, Euler used (16.16) and the series for $\ln 2$.
For negative integers, he noted that under the transformation $m \to 1 - m$, both sides
were converted into their reciprocals. For the right-hand side, he required the reflection
formula for the gamma function $\Gamma(m)\Gamma(1-m) = \frac{\pi}{\sin m\pi}$, and it appears that Euler
explicitly stated this formula for the first time in this paper. For $m = \frac12$, Euler used
the value $\Gamma(\frac12) = \sqrt{\pi}$; he had known this value since 1729, but it also followed
immediately from his reflection formula. Finally, for $m = \frac32$, Euler computed both
sides to several decimal places, checking that the results were identical. To do this, he
applied the Euler–Maclaurin formula to sum the divergent series $1 - \sqrt2 + \sqrt3 - \sqrt4 + \cdots = 0.380129$,
as well as the convergent series


20 See Stubhaug (2000) pp. 343-344 or Ore (1974) p. 97. 
\[ 1 - \frac{1}{2\sqrt2} + \frac{1}{3\sqrt3} - \frac{1}{4\sqrt4} + \cdots = 0.765158. \]

G. Faber, the editor of vol. 15 of Euler’s Opera Omnia in which this paper was
reprinted, noted that the values could be expressed more exactly as 0.380105 and
0.765147.
At the end of his 1749 paper,²¹ Euler mentioned without proof the functional
equation for a special L-function:

\[ \frac{1 - 3^{n-1} + 5^{n-1} - 7^{n-1} + \text{etc.}}{1 - 3^{-n} + 5^{-n} - 7^{-n} + \text{etc.}} = \frac{1\cdot2\cdot3\cdots(n-1)\,2^n}{\pi^n}\sin\frac{n\pi}{2}. \tag{16.24} \]


These results on the functional relations for the zeta and L-functions apparently
went unnoticed. In the 1840s, the functional equation (16.20) was given a complete
proof for $0 < m < 1$ by the Swedish mathematician Carl Johan Malmsten
(1814-1886),²² who mentioned seeing (16.20) somewhere in Euler. Oscar Schlömilch
independently found a proof and stated it as a problem in a journal in 1849; he gave
a solution in 1858.²³ A solution was published by Thomas Clausen (1801-1885) in
1858²⁴ and another was noted by Eisenstein on the last blank page in his copy of
the Disquisitiones in the French translation of 1807. André Weil conjectured that
Eisenstein discussed this topic with Riemann,²⁵ providing the impetus for Riemann’s
well-known paper of 1859 on the zeta function. Indeed, Riemann and Eisenstein had
been close friends in Berlin, and Eisenstein’s note is dated April 1849, just before
Riemann left Berlin for Göttingen.


16.2 Euler’s First Evaluation of $\sum \frac{1}{n^{2k}}$


Euler’s evaluation was based on the factorization given by (16.1) and (16.2): 


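\[ 1 - \frac{1}{y}\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots\right) = \left(1 - \frac{x}{A_1}\right)\left(1 - \frac{x}{A_2}\right)\left(1 - \frac{x}{A_3}\right)\cdots. \tag{16.25} \]

Note that Euler wrote $A, B, C, D, \ldots$ instead of $A_1, A_2, A_3, A_4, \ldots$. He observed
that the coefficients of the powers of $x$ were the elementary symmetric functions of
the infinitely many variables $\frac{1}{A_1}, \frac{1}{A_2}, \frac{1}{A_3}, \ldots$. Therefore, by equating coefficients,
he had

\[ \alpha = \sum_{i}\frac{1}{A_i} = \frac{1}{y}, \quad \beta = \sum_{i<j}\frac{1}{A_iA_j} = 0, \quad \gamma = \sum_{i<j<k}\frac{1}{A_iA_jA_k} = -\frac{1}{6y}, \quad \text{etc.,} \tag{16.26} \]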


21 ibid. § 20. 

22 Malmsten (1849). 

23 Schlömilch (1849) and (1858).
24 Clausen (1858). 

25 Weil (1989a). 
where $\alpha, \beta, \gamma, \delta$, etc. denoted the elementary symmetric functions. Euler then applied
the Girard–Newton formulas (13.15) connecting the sums of the squares, cubes, fourth
powers, etc., of $\frac{1}{A_1}, \frac{1}{A_2}, \frac{1}{A_3}, \ldots$ with the symmetric functions $\alpha, \beta, \gamma$, etc. Thus, he got

\[ \sum_i \frac{1}{A_i^2} = \alpha^2 - 2\beta, \quad \sum_i \frac{1}{A_i^3} = \alpha^3 - 3\alpha\beta + 3\gamma, \]

\[ \sum_i \frac{1}{A_i^4} = \alpha^4 - 4\alpha^2\beta + 4\alpha\gamma + 2\beta^2 - 4\delta, \quad \text{etc.} \]
i 


For $y = 1$, the roots $A_1, A_2, A_3, \ldots$ of the equation $\sin x = 1$ were

\[ \frac{\pi}{2},\ \frac{\pi}{2},\ -\frac{3\pi}{2},\ -\frac{3\pi}{2},\ \frac{5\pi}{2},\ \frac{5\pi}{2},\ \ldots, \]

since $\sin x - 1 = 0$ had double roots. Thus, Euler obtained equations (16.6), (16.7),
and (16.8). Clearly, he could continue the calculations to arbitrarily large powers of the
reciprocals of the roots. Euler explicitly wrote the values of $\sum \frac{1}{n^{2k}}$ for $k = 1,2,\ldots,6$,
and the last of these turned out to be

\[ 1 + \frac{1}{2^{12}} + \frac{1}{3^{12}} + \frac{1}{4^{12}} + \cdots = \frac{691\pi^{12}}{6825\cdot 93555}. \tag{16.27} \]
The appearance of the fairly large prime 691 may have alerted Euler to the connection 
of zeta values with Bernoulli numbers. Recall that this prime had already appeared 
in the Euler—Maclaurin series he had found only two or three years earlier, and at 
the time he discovered (16.27), he was still intensely studying the Euler—Maclaurin 


summation. 
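The Girard–Newton step in this section can be replayed mechanically for the case $y = 1$. In the sketch below, which is only an illustration, the elementary symmetric functions of the reciprocal roots are read off from $1 - \sin x = 1 - x + \frac{x^3}{3!} - \frac{x^5}{5!} + \cdots$, the power sums are generated by the standard Newton recursion, and the result is compared with direct partial sums of the series over odd integers. The function names `elementary`, `power_sums` and `odd_series` are hypothetical, and the truncation at 20,000 terms is an arbitrary choice.

```python
from fractions import Fraction
from math import factorial, pi

def elementary(j):
    """e_j of the reciprocal roots of 1 - sin x = 0 (Euler's case y = 1).

    Matching prod(1 - x/A_i) with 1 - x + x^3/3! - x^5/5! + ... gives
    e_j = 0 for even j and e_j = (-1)^((j-1)/2) / j! for odd j.
    """
    if j % 2 == 0:
        return Fraction(0)
    return Fraction((-1) ** ((j - 1) // 2), factorial(j))

def power_sums(kmax):
    """Girard-Newton recursion for p_k, the sum of the k-th powers of the 1/A_i."""
    p = {}
    for k in range(1, kmax + 1):
        total = (-1) ** (k - 1) * k * elementary(k)
        for i in range(1, k):
            total += (-1) ** (i - 1) * elementary(i) * p[k - i]
        p[k] = total
    return p

def odd_series(k, terms=20_000):
    """Partial sum of 1 +/- 1/3^k + 1/5^k +/- ... (signs alternate when k is odd)."""
    return sum(((-1) ** n if k % 2 else 1) / (2 * n + 1) ** k for n in range(terms))

p = power_sums(4)
for k in (2, 3, 4):
    # The roots pi/2, -3pi/2, 5pi/2, ... are double, so p_k = 2 (2/pi)^k * series.
    predicted = float(p[k]) * (pi / 2) ** k / 2
    print(k, p[k], predicted, odd_series(k))
```

The exact power sums $1, \frac12, \frac13$ for $k = 2, 3, 4$ correspond to the values $\frac{\pi^2}{8}, \frac{\pi^3}{32}, \frac{\pi^4}{96}$ in (16.6) through (16.8).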


16.3 Euler: Bernoulli Numbers and $\sum \frac{1}{n^{2k}}$


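In a paper presented to the Petersburg Academy in October 1739, but published in
1750,²⁶ Euler explained the connection between the Bernoulli numbers appearing in
the Euler–Maclaurin formula and the sums $\sum \frac{1}{n^{2k}}$. A year earlier, he had found the
partial fractions expansion of $\cot x$ and he made use of this in his explanation. Euler
started with a generating function for the sums $\sum \frac{1}{n^{2k}}$ and changed the order of summation
to obtain a partial fractions expansion that he could recognize as a cotangent
function. Denoting the generating function by $S$, he had by a rearrangement of terms

\[ \begin{aligned} S &= \left(\sum_{n=1}^{\infty}\frac{1}{n^2}\right)x^2 + \left(\sum_{n=1}^{\infty}\frac{1}{n^4}\right)x^4 + \left(\sum_{n=1}^{\infty}\frac{1}{n^6}\right)x^6 + \cdots \\ &= \left(x^2 + x^4 + x^6 + \cdots\right) + \left(\frac{x^2}{2^2} + \frac{x^4}{2^4} + \frac{x^6}{2^6} + \cdots\right) + \left(\frac{x^2}{3^2} + \frac{x^4}{3^4} + \frac{x^6}{3^6} + \cdots\right) + \cdots \\ &= \frac{x^2}{1 - x^2} + \frac{x^2}{2^2 - x^2} + \frac{x^2}{3^2 - x^2} + \frac{x^2}{4^2 - x^2} + \cdots = \frac12 - \frac{\pi x}{2\tan \pi x}. \end{aligned} \tag{16.28} \]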


26 Eu. I-14 pp. 407-462. E 130 §§ 22-24.
At this point, Euler could have used equation (16.15) to derive the value of $\sum \frac{1}{n^{2k}}$
in terms of Bernoulli numbers. By expressing $\tan \pi x$ in terms of the exponential
function, he would have obtained

\[ \frac{\pi x}{\tan \pi x} = \pi i x + \frac{2\pi i x}{e^{2\pi i x} - 1}. \tag{16.29} \]

He next could have used his generating function for the Bernoulli numbers appearing
in the Euler–Maclaurin formula to express the right-hand side as

\[ 1 - \sum_{n=1}^{\infty}(-1)^{n-1}\frac{B_{2n}}{(2n)!}(2\pi x)^{2n}. \]

In 1740 Euler was just beginning to delve into the connection between the circular
and exponential functions; he was not yet ready to make full use of it. For example, in
a letter to Johann Bernoulli written during this period,²⁷ he explained the equality of
$2\cos x$ and $e^{ix} + e^{-ix}$ by means of differential equations. Similarly, in his 1750 paper,
he proved (16.29) through the use of differential equations. Thus, Euler continued his
argument, proceeding to define $A, B, C, \ldots$ by $A = \frac{1}{\pi^2}\sum \frac{1}{n^2}$, $B = \frac{1}{\pi^4}\sum \frac{1}{n^4}$, etc.
Since

\[ S = \frac12\left(1 - \frac{\pi x}{\tan \pi x}\right), \quad u = \arctan\frac{u}{1 - 2S}, \]

where $u = \pi x$, a simple calculation showed Euler that $S$ satisfied

\[ 2u\frac{dS}{du} + 2S = u^2 + 4S^2. \]

He substituted the series $S = Au^2 + Bu^4 + Cu^6 + Du^8 + \cdots$ into the differential
equation and determined that

\[ A = \frac16, \quad B = \frac{2A^2}{5}, \quad C = \frac{4AB}{7}, \quad D = \frac{4AC + 2B^2}{9}, \quad E = \frac{4AD + 4BC}{11}, \ \ldots. \tag{16.30} \]
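Matching powers of $u$ in $2u\frac{dS}{du} + 2S = u^2 + 4S^2$ with $S = a_1u^2 + a_2u^4 + \cdots$ gives $a_1 = \frac16$ and $(2n+1)\,a_n = 2\sum_{i=1}^{n-1}a_ia_{n-i}$ for $n \geq 2$, which reproduces the relations (16.30) for $A = a_1$, $B = a_2$, and so on. The short Python sketch below runs this recursion exactly; the function name `zeta_even_over_pi_powers` is a hypothetical label, and the identification $a_n = \zeta(2n)/\pi^{2n}$ is the interpretation used in this section.

```python
from fractions import Fraction

def zeta_even_over_pi_powers(count):
    """Return [a_1, ..., a_count] with a_n = zeta(2n) / pi^(2n), computed exactly."""
    a = [Fraction(1, 6)]  # a_1 = A = 1/6
    for n in range(2, count + 1):
        # Convolution sum_{i=1}^{n-1} a_i a_{n-i}; the list is 0-indexed.
        conv = sum(a[i] * a[n - 2 - i] for i in range(n - 1))
        a.append(2 * conv / (2 * n + 1))
    return a

coeffs = zeta_even_over_pi_powers(6)
for n, value in enumerate(coeffs, start=1):
    print(f"zeta({2 * n}) / pi^{2 * n} = {value}")
```

The sixth value is $\frac{691}{638512875}$, and $638512875 = 6825\cdot 93555$, so the recursion recovers the prime 691 of (16.27).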
He then observed²⁸ that the coefficients in the Euler–Maclaurin series were
generated by

\[ s = \frac{xe^x}{e^x - 1} = 1 + \alpha x + \beta x^2 + \gamma x^3 + \delta x^4 + \text{etc.} \]

and saw that $s$ satisfied the differential equation

\[ x\frac{ds}{dx} = s + xs - s^2. \]


27 Eu. IV-2 p. 392.
28 Eu. I-14 pp. 407-462, especially §§ 27-28. E 130. 
By substituting the series for $s$, he obtained relations for the coefficients $\alpha, \beta,
\gamma, \delta$, etc. He noted that except for $\alpha = \frac12$, the coefficients of the odd powers were
zero. To see this more easily, the reader may consider the fact that $-\frac{x}{2} + \frac{xe^x}{e^x - 1}$ is an
even function. Euler next set $\beta = \frac{A}{2}$, $\delta = -\frac{B}{2^3}$, $\zeta = \frac{C}{2^5}$, $\theta = -\frac{D}{2^7}$, $\kappa = \frac{E}{2^9}$, etc., where
$\zeta, \theta, \kappa$ denoted the coefficients of $x^6, x^8, x^{10}$ respectively. He then showed that these
$A, B, C, \ldots$ also satisfied the relations (16.30). Thus, Euler had the formula we now
write as

\[ \zeta(2n) = 1 + \frac{1}{2^{2n}} + \frac{1}{3^{2n}} + \cdots = (-1)^{n-1}\frac{2^{2n-1}\pi^{2n}}{(2n)!}B_{2n}. \tag{16.31} \]


16.4 Euler’s Evaluation of Some L-Series Values by Partial Fractions 


Euler’s essential idea in the derivation of his famous zeta value formula, proved
in the last section, was the partial fractions expansion of $\frac{\pi x}{\tan \pi x}$. After his move to
Berlin in 1741, Euler followed this up with a paper in the Berlin Academy journal of
1743.²⁹ There he showed how the same partial fractions could also be applied to the
derivation of several L-series values. See Chapter 28 for a definition of L-series. Euler 
started with 


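\[ \frac{\pi}{\sin s\pi} = \frac{1}{s} + \frac{1}{1-s} - \frac{1}{1+s} - \frac{1}{2-s} + \frac{1}{2+s} + \frac{1}{3-s} - \frac{1}{3+s} - \text{etc.}, \tag{16.32} \]

\[ \frac{\pi}{\tan s\pi} = \frac{1}{s} - \frac{1}{1-s} + \frac{1}{1+s} - \frac{1}{2-s} + \frac{1}{2+s} - \frac{1}{3-s} + \frac{1}{3+s} - \text{etc.}, \tag{16.33} \]

where he took $s$ to be a rational number $s = \frac{p}{q}$. He assigned specific integer values to
$p$ and $q$ and evaluated several series, including (16.5), (16.11), and (16.12).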
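To see how a rational value of $s$ turns these partial fractions into familiar series, one can sum the expansions of $\frac{\pi}{\tan s\pi}$ and $\frac{\pi}{\sin s\pi}$ numerically at $s = \frac14$. Dividing by 4 then gives the series (16.5) and (16.11). The following Python sketch is an illustration only; the function names `cot_partial` and `csc_partial` and the truncation at 200,000 pairs of terms are assumptions made for this example.

```python
import math

def cot_partial(s, terms=200_000):
    """Partial sum of 1/s - 1/(1-s) + 1/(1+s) - 1/(2-s) + 1/(2+s) - ..."""
    total = 1 / s
    for k in range(1, terms + 1):
        total += -1 / (k - s) + 1 / (k + s)
    return total

def csc_partial(s, terms=200_000):
    """Partial sum of 1/s + 1/(1-s) - 1/(1+s) - 1/(2-s) + 1/(2+s) + ..."""
    total = 1 / s
    for k in range(1, terms + 1):
        sign = 1 if k % 2 else -1
        total += sign * (1 / (k - s) - 1 / (k + s))
    return total

s = 0.25
print(cot_partial(s) / 4, math.pi / 4)                     # series (16.5)
print(csc_partial(s) / 4, math.pi / (2 * math.sqrt(2)))    # series (16.11)
```

With $s = \frac14$ each term $\frac{1}{k \mp s}$ becomes $\frac{4}{4k \mp 1}$, so the denominators run through the odd integers, which is why the factor 4 is divided out before comparing.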

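To get the series for the squares of the partial fractions, Euler took the derivatives
of (16.32) and (16.33) to obtain

\[ \frac{\pi^2\cos \pi s}{(\sin \pi s)^2} = \frac{1}{s^2} - \frac{1}{(1-s)^2} - \frac{1}{(1+s)^2} + \frac{1}{(2-s)^2} + \frac{1}{(2+s)^2} - \frac{1}{(3-s)^2} - \text{etc.}, \tag{16.34} \]

\[ \frac{\pi^2}{(\sin \pi s)^2} = \frac{1}{s^2} + \frac{1}{(1-s)^2} + \frac{1}{(1+s)^2} + \frac{1}{(2-s)^2} + \frac{1}{(2+s)^2} + \frac{1}{(3-s)^2} + \text{etc.} \tag{16.35} \]

Among examples of these relations, for $s = \frac13$ Euler gave

\[ \frac{2\pi^2}{27} = 1 - \frac{1}{2^2} - \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{7^2} - \frac{1}{8^2} - \frac{1}{10^2} + \text{etc.}, \tag{16.36} \]

\[ \frac{4\pi^2}{27} = 1 + \frac{1}{2^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{7^2} + \frac{1}{8^2} + \frac{1}{10^2} + \text{etc.}, \tag{16.37} \]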


29 Eu. I-14 pp. 138-155, E 61 §§ 18-20.
and for $s = \frac14$ in (16.34) he obtained

\[ \frac{\pi^2}{8\sqrt2} = 1 - \frac{1}{3^2} - \frac{1}{5^2} + \frac{1}{7^2} + \frac{1}{9^2} - \frac{1}{11^2} - \frac{1}{13^2} + \text{etc.} \tag{16.38} \]

He knew that he could obtain (16.37) directly from (16.9). Near the end of the
paper he noted that if $P$ and $Q$ denoted the left-hand sides of (16.32) and (16.33),
respectively, then

\[ \frac{(-1)^{n-1}}{(n-1)!}\frac{d^{n-1}P}{ds^{n-1}} = \frac{1}{s^n} + \frac{(-1)^{n-1}}{(1-s)^n} - \frac{1}{(1+s)^n} + \frac{(-1)^{n}}{(2-s)^n} + \frac{1}{(2+s)^n} + \frac{(-1)^{n-1}}{(3-s)^n} - \text{etc.}, \tag{16.39} \]

\[ \frac{(-1)^{n-1}}{(n-1)!}\frac{d^{n-1}Q}{ds^{n-1}} = \frac{1}{s^n} + \frac{(-1)^{n}}{(1-s)^n} + \frac{1}{(1+s)^n} + \frac{(-1)^{n}}{(2-s)^n} + \frac{1}{(2+s)^n} + \frac{(-1)^{n}}{(3-s)^n} + \text{etc.} \tag{16.40} \]


16.5 Euler’s Evaluation of $\sum \frac{1}{n^2}$ by Integration


Because some mathematicians raised objections to his first evaluation of $\sum \frac{1}{n^2}$, Euler
looked for other methods. His evaluations by partial fractions were immune to these
objections, but even before finding the partial fractions method, Euler discovered an
ingenious technique using integration. In 1737, Euler worked out and communicated
to Johann Bernoulli his integration method.³⁰ But the paper containing this method
appeared in 1743 in the Journal littéraire d’Allemagne.³¹ It was unusual for Euler
to use this journal; consequently, the paper received very scant notice. It was finally
reprinted in the 1907-1908 Bibliotheca Mathematica as a forgotten work of Euler.³²
However, Euler’s evaluation of $\sum \frac{1}{n^2}$ was presented in a 1743 calculus book by
Simpson, although Simpson did not mention Euler.³³
Briefly and in modern notation, Euler started with Newton’s series

\[ \arcsin x = x + \frac12\,\frac{x^3}{3} + \frac{1\cdot3}{2\cdot4}\,\frac{x^5}{5} + \frac{1\cdot3\cdot5}{2\cdot4\cdot6}\,\frac{x^7}{7} + \cdots \]

to get

\[ \frac12(\arcsin x)^2 = \int_0^x \frac{\arcsin t}{\sqrt{1-t^2}}\,dt = \int_0^x \left(t + \frac12\,\frac{t^3}{3} + \cdots\right)\frac{dt}{\sqrt{1-t^2}}. \tag{16.41} \]


30 Eu. IV A-2 pp. 161-175, especially pp. 170-171. 
31 Eu. I-14 pp. 177-186. E 63.

32 Stäckel (1908).

33 Simpson (1743) pp. 140-141. 
Assuming $n$ was odd, since only odd powers appeared in the series, integration by
parts gave him

\[ \int_0^1 \frac{x^{n+2}}{\sqrt{1-x^2}}\,dx = \frac{n+1}{n+2}\int_0^1 \frac{x^{n}}{\sqrt{1-x^2}}\,dx = \frac{(n+1)(n-1)(n-3)\cdots 2}{(n+2)\,n\,(n-2)\cdots 3}. \]

Euler applied this to (16.41) with $x = 1$ to obtain

\[ \frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \cdots. \tag{16.42} \]
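The cancellation behind this step can be checked exactly: the coefficient of $t^{2k+1}$ in Newton's series for $\arcsin t$ is $\frac{1\cdot3\cdots(2k-1)}{2\cdot4\cdots(2k)}\cdot\frac{1}{2k+1}$, the integral of $\frac{t^{2k+1}}{\sqrt{1-t^2}}$ over (0, 1) is $\frac{2\cdot4\cdots(2k)}{3\cdot5\cdots(2k+1)}$, and their product collapses to $\frac{1}{(2k+1)^2}$. The Python sketch below verifies this with rational arithmetic; the function names `arcsin_coeff` and `odd_power_integral` are hypothetical labels chosen for illustration.

```python
from fractions import Fraction

def arcsin_coeff(k):
    """Coefficient of t^(2k+1) in Newton's series for arcsin t."""
    c = Fraction(1)
    for j in range(1, k + 1):
        c *= Fraction(2 * j - 1, 2 * j)
    return c / (2 * k + 1)

def odd_power_integral(k):
    """Integral of t^(2k+1)/sqrt(1 - t^2) over (0, 1): 2*4*...*(2k) / (3*5*...*(2k+1))."""
    w = Fraction(1)
    for j in range(1, k + 1):
        w *= Fraction(2 * j, 2 * j + 1)
    return w

# Term-by-term integration of (16.41) at x = 1.
terms = [arcsin_coeff(k) * odd_power_integral(k) for k in range(8)]
print(terms)
```

Since the left-hand side of (16.41) at $x = 1$ is $\frac12\left(\frac{\pi}{2}\right)^2 = \frac{\pi^2}{8}$, the terms $1, \frac19, \frac{1}{25}, \ldots$ give the sum of the reciprocals of the odd squares.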


Unfortunately, this method could not be extended to $\sum \frac{1}{n^{2k}}$ for $k = 2,3,\ldots$,


though Euler attempted it. For example, he considered the series for (arcsin x)? 
divided by V1 — x? and integrated over (0,1). After some similar calculations, he 
obtained 


W 
48° 22 2 ae 2 9 
A 2 
a result equivalent to > = >? s. 
In a letter dated August 27, 1737,³⁴ of the Russian or Julian calendar, Euler
communicated to his former teacher, Johann Bernoulli, the evaluation of $\sum \frac{1}{n^2}$ by
means of (16.41). In his reply of November 6 (Gregorian), Bernoulli expressed his
admiration for this method and observed that it had led him to find a new series
for $\pi^2$:³⁵

\[ \frac{1}{1\cdot2} + \frac{2}{1\cdot3\cdot4} + \frac{2\cdot4}{1\cdot3\cdot5\cdot6} + \frac{2\cdot4\cdot6}{1\cdot3\cdot5\cdot7\cdot8} + \frac{2\cdot4\cdot6\cdot8}{1\cdot3\cdot5\cdot7\cdot9\cdot10} + \cdots = \frac{\pi^2}{8}. \tag{16.43} \]


We note that Bernoulli wrote $C$ instead of $\pi$. Since the Russian calendar at that time
was about ten days behind the Gregorian calendar, keeping track of correspondence
can be challenging.

Bernoulli gave no further details in his letter, but in 1742 he offered an explanation
in the fourth volume of his Opera Omnia.³⁶ His method was to divide
Newton’s transformed series for $\arctan t$, (10.4), written up in the De Computo, by
$1 + t^2$ and then integrate. Recall that Newton had not published his work, so the
alternative series for $\arctan t$ in powers of $\frac{t^2}{1+t^2}$ was Bernoulli’s rediscovery. Thus,
Bernoulli had


34 Eu. IV A-2, pp. 170-171.
35 ibid. pp. 185-187.
36 Bernoulli (1742) vol. 4, p. 25.
\[ \frac{(\arctan t)^2}{2} = \int \frac{\arctan t}{1+t^2}\,dt = \int\left(\frac{t}{(1+t^2)^2} + \frac{2t^3}{1\cdot3\cdot(1+t^2)^3} + \frac{2\cdot4\,t^5}{1\cdot3\cdot5\,(1+t^2)^4} + \cdots\right)dt. \tag{16.44} \]

He obtained (16.43) by integrating this formula over $(0, \infty)$. Concerning (16.43),
Euler noted in a letter dated December 10, 1737 (Julian), that he had found the more
general series:³⁷

\[ \frac12(\arcsin x)^2 = \frac{x^2}{1\cdot2} + \frac{2x^4}{1\cdot3\cdot4} + \frac{2\cdot4\,x^6}{1\cdot3\cdot5\cdot6} + \frac{2\cdot4\cdot6\,x^8}{1\cdot3\cdot5\cdot7\cdot8} + \cdots. \tag{16.45} \]


Euler went on to remark that Bernoulli’s formula followed from this by taking
$x = 1$. Moreover, he observed that other interesting series would result by taking
$x = \frac12$, $\frac{1}{\sqrt2}$, or $\frac{\sqrt3}{2}$.
In his 1743 paper, Euler gave a derivation of (16.45) by observing that $y = (\arcsin x)^2$
satisfied the second-order linear differential equation

\[ (1 - x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} - 2 = 0. \tag{16.46} \]

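He then solved this equation by infinite series to prove (16.45). Observe, however,
that Bernoulli could have obtained Euler’s formula (16.45) from his own (16.44):
Rewrite (16.44) as

\[ \begin{aligned} \frac12(\arctan z)^2 &= \int_0^z \frac{\arctan t}{1+t^2}\,dt \\ &= \sum_{n=0}^{\infty}\frac{2^{2n}(n!)^2}{(2n+1)!}\int_0^z\left(\frac{t^2}{1+t^2}\right)^{n}\frac{t\,dt}{(1+t^2)^2} \\ &= \sum_{n=0}^{\infty}\frac{2^{2n}(n!)^2}{(2n+2)!}\left(\frac{z^2}{1+z^2}\right)^{n+1}. \end{aligned} \tag{16.47} \]

Now set $x^2 = \frac{z^2}{1+z^2}$, that is, $z = \tan(\arcsin x)$, and (16.45) follows.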
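A quick numerical comparison makes the substitution concrete. The Python sketch below, offered only as an illustration, evaluates the series $\sum_{n\ge0}\frac{2^{2n}(n!)^2}{(2n+2)!}x^{2n+2}$, whose first terms are $\frac{x^2}{1\cdot2} + \frac{2x^4}{1\cdot3\cdot4} + \cdots$, and compares it with $\frac12(\arcsin x)^2$; it also checks that $z = \tan(\arcsin x)$ satisfies $x^2 = \frac{z^2}{1+z^2}$. The function name `half_arcsin_squared_series` and the cutoff of 60 terms are assumptions made for this example.

```python
import math

def half_arcsin_squared_series(x, terms=60):
    """Partial sum of sum_{n>=0} 2^(2n) (n!)^2 / (2n+2)! * x^(2n+2)."""
    total = 0.0
    for n in range(terms):
        coeff = 4**n * math.factorial(n) ** 2 / math.factorial(2 * n + 2)
        total += coeff * x ** (2 * n + 2)
    return total

for x in (0.5, 1 / math.sqrt(2), math.sqrt(3) / 2):
    z = math.tan(math.asin(x))
    # x^2 = z^2 / (1 + z^2) links the arctan form to the arcsin form.
    print(x, half_arcsin_squared_series(x), 0.5 * math.asin(x) ** 2, z * z / (1 + z * z))
```

The series converges geometrically for $|x| < 1$, so a modest number of terms suffices away from $x = 1$; at $x = 1$ itself the convergence is slow.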

We here note the remarkable fact that the Japanese mathematician Takebe Katahiro
(1664-1739) published both Bernoulli’s (16.43) and Euler’s (16.45) series in his 1722
treatise, Tetsujutsu Sankei.³⁸ Takebe’s approach was very different, since he apparently
made his discoveries of these series on the basis of a considerable amount of numerical 
work. He related the length of an arc of a circle determined by a chord to the height 


37 Eu. IV A-2 pp. 191-198, especially p. 196.
38 Smith and Mikami (1914) pp. 148-153. 




of the chord. The latter would be the distance between the midpoint of the arc and the 
midpoint of the chord. After finding the series, Takebe sought an analytic justification 
for it. The Tetsujutsu exerted great influence on the development of mathematics
in eighteenth-century Japan,³⁹ spurring Japanese mathematicians to discover other
series for $\pi$.


16.6 N. Bernoulli’s Evaluation of $\sum \frac{1}{(2n+1)^2}$


Euler was eager to find many different evaluations of $\sum \frac{1}{n^2}$ and in this he was assisted
by Niklaus I Bernoulli who in 1738 published a very interesting method⁴⁰ by squaring
the Madhava–Leibniz series for $\pi$, given by (16.5). Bernoulli’s derivation involved
many transformations of series but in a July 1738 letter to Johann Bernoulli, Euler gave
a shorter proof by greatly simplifying the second portion.⁴¹ We present this simplified
proof, whose fundamental idea remained Bernoulli’s.

Bernoulli first observed that by squaring equation (16.5) he had 


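\[ \frac{\pi^2}{16} = \sum_{n=0}^{\infty}\frac{1}{(2n+1)^2} - 2\sum_{n=0}^{\infty}\frac{1}{2n+1}\cdot\frac{1}{2n+3} + 2\sum_{n=0}^{\infty}\frac{1}{2n+1}\cdot\frac{1}{2n+5} - 2\sum_{n=0}^{\infty}\frac{1}{2n+1}\cdot\frac{1}{2n+7} + \cdots. \tag{16.48} \]

The first series on the right was the sum of the squares of $1, \frac13, \frac15, \ldots$. The other series
were the sums of the mixed terms obtained by squaring. He then noted that

\[ 2\sum_{n=0}^{\infty}\frac{1}{2n+1}\cdot\frac{1}{2n+3} = \sum_{n=0}^{\infty}\left(\frac{1}{2n+1} - \frac{1}{2n+3}\right) = 1, \]

\[ 2\sum_{n=0}^{\infty}\frac{1}{2n+1}\cdot\frac{1}{2n+5} = \frac12\sum_{n=0}^{\infty}\left(\frac{1}{2n+1} - \frac{1}{2n+5}\right) = \frac12\left(1 + \frac13\right), \]

and so on. Hence,

\[ \frac{\pi^2}{16} = \sum_{n=0}^{\infty}\frac{1}{(2n+1)^2} - \left(1 - \frac12\left(1 + \frac13\right) + \frac13\left(1 + \frac13 + \frac15\right) - \cdots\right). \tag{16.49} \]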


39 Ogawa and Morimoto (2018). 
40 Bernoulli (1738). 
41 Eu. IV A-2 pp. 230-247, especially pp. 242-243.
Euler’s simplification took effect at this point. To sum the series within the
parentheses in (16.49), he observed that

$$
\frac{\arctan x}{1+x^2} = x - \left(1+\frac13\right)x^3 + \left(1+\frac13+\frac15\right)x^5 - \cdots. \tag{16.50}
$$


Euler then integrated over (0,1) to obtain 


$$
\frac12(\arctan 1)^2 = \frac12\left(\frac{\pi}{4}\right)^2 = \frac12 - \frac14\left(1+\frac13\right) + \frac16\left(1+\frac13+\frac15\right) - \cdots.
$$


Thus, he showed that the series within parentheses in (16.49) was equal to $\frac{\pi^2}{16}$; hence,

$$
\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2} = \frac{\pi^2}{8},
$$

as was required. Not only did Euler simplify N. Bernoulli’s proof but
he also obtained a more general result. This result and the inspiration and assistance
of his friend Christian Goldbach eventually led him to a fruitful study of double zeta
values.
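The key step in Euler's simplification is that multiplying the power series of arctan x by that of 1/(1 + x²) produces coefficients of the form ±(1 + 1/3 + ··· + 1/(2m+1)). The following is a minimal modern check of that pattern in exact rational arithmetic; the function names are illustrative and the script is not a reconstruction of Euler's own computation.

```python
from fractions import Fraction

def product_coefficients(count):
    """Coefficients of x, x^3, x^5, ... in arctan(x) * 1/(1 + x^2)."""
    arctan = [Fraction((-1) ** j, 2 * j + 1) for j in range(count)]  # x^(2j+1)
    geometric = [Fraction((-1) ** j) for j in range(count)]           # x^(2j)
    return [sum(arctan[j] * geometric[m - j] for j in range(m + 1))
            for m in range(count)]

def odd_harmonic_pattern(count):
    """(-1)^m * (1 + 1/3 + ... + 1/(2m+1)) for m = 0, 1, ..."""
    pattern, partial = [], Fraction(0)
    for m in range(count):
        partial += Fraction(1, 2 * m + 1)
        pattern.append((-1) ** m * partial)
    return pattern

print(product_coefficients(3))  # exact rational coefficients of x, x^3, x^5
```

Integrating the product term by term over (0, 1) then divides the m-th coefficient by 2m + 2, which is how the factors 1/2, 1/4, 1/6 arise.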


16.7 Euler and Goldbach: Double Zeta Values 


The route toward the consideration of double zeta values started with a theorem 
communicated by Euler to Goldbach on August 28, 1742.⁴² Note that this is a
generalization of the last equation in N. Bernoulli’s evaluation, (16.49). Euler’s 
theorem states that if 


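$$
s = 1 + \frac{a}{n+1} + \frac{a^2}{2n+1} + \frac{a^3}{3n+1} + \frac{a^4}{4n+1} + \cdots, \tag{16.51}
$$

then

$$
\begin{aligned}
\frac{s^2}{2} = \frac12 &+ \frac{a}{n+2}\left(1+\frac{1}{n+1}\right) + \frac{a^2}{2n+2}\left(1+\frac{1}{n+1}+\frac{1}{2n+1}\right) \\
&+ \frac{a^3}{3n+2}\left(1+\frac{1}{n+1}+\frac{1}{2n+1}+\frac{1}{3n+1}\right) \\
&+ \frac{a^4}{4n+2}\left(1+\frac{1}{n+1}+\frac{1}{2n+1}+\frac{1}{3n+1}+\frac{1}{4n+1}\right) + \cdots.
\end{aligned} \tag{16.52}
$$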
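A numerical check can make the structure of Euler's theorem concrete. The sketch below assumes the theorem reads s = Σ aᵏ/(kn+1) and s²/2 = Σ aᵐ/(mn+2) · (1 + 1/(n+1) + ··· + 1/(mn+1)), and compares both sides for a value |a| < 1 where the series converge quickly. The function name and the sample values of a and n are arbitrary choices for illustration.

```python
def euler_theorem_sides(a, n, terms=200):
    """Return (s^2 / 2, right-hand series), both truncated after `terms` terms."""
    s = sum(a ** k / (k * n + 1) for k in range(terms))
    rhs, inner = 0.0, 0.0
    for m in range(terms):
        inner += 1.0 / (m * n + 1)           # 1 + 1/(n+1) + ... + 1/(mn+1)
        rhs += a ** m / (m * n + 2) * inner
    return s * s / 2, rhs

left, right = euler_theorem_sides(a=0.5, n=3)
print(left, right)
```

Taking a = −1 and n = 2 gives N. Bernoulli's case, although convergence is then too slow for a naive partial sum to be convincing.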


Goldbach was intrigued by this result and raised some questions about it in a letter 
dated October 1, 1742.⁴³ Euler responded by explaining how Niklaus I Bernoulli’s
result, discussed in the previous section, could be obtained from the theorem. This led 


42 Fuss (1968) vol. 1, pp. 144-153, especially p. 153. 
43 ibid. pp. 154-159, especially p. 156.
Goldbach to consider the series now known as double zeta values. In a letter to Euler
of December 24, 1742, Goldbach wrote that the type of series in Euler’s theorem
suggested the study of the series

$$
1 + \frac{1}{2^n}\left(1+\frac{1}{2^m}\right) + \frac{1}{3^n}\left(1+\frac{1}{2^m}+\frac{1}{3^m}\right) + \frac{1}{4^n}\left(1+\frac{1}{2^m}+\frac{1}{3^m}+\frac{1}{4^m}\right) + \cdots. \tag{16.53}
$$

In fact, Goldbach wrote only examples of such series for n = 1, m = 1; n = 2, m = 1;
n = 1, m = 2. He also considered alternating series.

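Let us denote this series by $\zeta_G(n,m)$. In modern terminology, this is almost the
double zeta value defined by

$$
\zeta(n,m) = \frac{1}{2^n} + \frac{1}{3^n}\left(1+\frac{1}{2^m}\right) + \frac{1}{4^n}\left(1+\frac{1}{2^m}+\frac{1}{3^m}\right) + \cdots
= \zeta_G(n,m) - \zeta(m+n). \tag{16.54}
$$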


Goldbach further wrote that he had found 


$$
\zeta_G(3,1) = \frac{\pi^4}{72} \quad\text{and}\quad \zeta_G(5,1) + \frac12\,\zeta_G(4,2) = \frac{19\pi^6}{2^2\cdot 5\cdot 7\cdot 3^4}. \tag{16.55}
$$

He also mentioned that, while he could not evaluate $\zeta_G(n,m)$, he could handle
$\zeta_G(n,m) + \zeta_G(m,n)$. Euler must have been greatly fascinated by these series, for on
January 5, 1743, he responded with details of the proof of his theorem and then two
weeks later gave a number of evaluations of particular cases of Goldbach’s series.⁴⁵
Thus, he had
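$$
\zeta_G(3,1) = \tfrac12(\zeta(2))^2; \qquad \zeta_G(5,1) = \zeta(2)\zeta(4) - \tfrac12(\zeta(3))^2;
$$
$$
\zeta_G(7,1) = \zeta(2)\zeta(6) - \zeta(3)\zeta(5) + \tfrac12(\zeta(4))^2;
$$
$$
\zeta_G(9,1) = \zeta(2)\zeta(8) - \zeta(3)\zeta(7) + \zeta(4)\zeta(6) - \tfrac12(\zeta(5))^2;
$$
$$
\zeta_G(2,2) = \tfrac12(\zeta(2))^2 + \tfrac12\zeta(4); \qquad \zeta_G(4,2) = (\zeta(3))^2 - \tfrac13\zeta(6);
$$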
3 1 
$G (6,2) = 26(3)5(5) — 564)" + 768): 
4 1 
6G (8,2) = 26(3)E(7) — 36(4)5 (6) + 565)" = 55 (10); 
$$
\zeta_G(3,3) = \tfrac12(\zeta(3))^2 + \tfrac12\zeta(6);
$$
oe a 
6G(5,3) = 5 EO) = 3° 8) = 16.3.95-7" (16.56) 


44 ibid. pp. 172-175, especially p. 175. 
45 ibid. pp. 188-192, especially pp. 189-190. 
Euler showed how Goldbach’s results (16.55) could be derived from these values
and, concerning Goldbach’s remark on $\zeta_G(m,n) + \zeta_G(n,m)$, he observed that

$$
\zeta(m)\zeta(n) + \zeta(m+n) = \zeta_G(m,n) + \zeta_G(n,m). \tag{16.57}
$$

Presumably, Goldbach had also found this elementary but basic formula, now
written as

$$
\zeta(m)\zeta(n) - \zeta(m+n) = \zeta(m,n) + \zeta(n,m). \tag{16.58}
$$
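The symmetric-sum identity is easy to test numerically, since it already holds exactly for partial sums truncated at a common upper limit. The sketch below, with illustrative function names, assumes Goldbach's series has the form Σₖ k⁻ⁿ(1 + 2⁻ᵐ + ··· + k⁻ᵐ) and also compares the partial sum for the case (3, 1) with π⁴/72.

```python
from math import pi

def zeta_partial(s, terms):
    """Partial sum of zeta(s)."""
    return sum(1.0 / k ** s for k in range(1, terms + 1))

def zeta_G_partial(n, m, terms):
    """Partial sum of Goldbach's series: sum_k k^-n * (1 + 2^-m + ... + k^-m)."""
    total, inner = 0.0, 0.0
    for k in range(1, terms + 1):
        inner += 1.0 / k ** m
        total += inner / k ** n
    return total

N = 20_000
lhs = zeta_partial(2, N) * zeta_partial(3, N) + zeta_partial(5, N)
rhs = zeta_G_partial(2, 3, N) + zeta_G_partial(3, 2, N)
print(lhs, rhs)                                # the two sides of the identity
print(zeta_G_partial(3, 1, N), pi ** 4 / 72)   # Goldbach's first evaluation
```

The identity is symmetric in m and n, which is why it determines only the sum of the two series and not each one separately.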


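In a letter of February 26, 1743,⁴⁶ Euler explained that his results depended on the
partial fractions identity:

$$
\frac{1}{x^m(x+a)^n} = \frac{A_0}{x^m} + \frac{A_1}{x^{m-1}} + \cdots + \frac{A_{m-1}}{x} + \frac{B_0}{(x+a)^n} + \frac{B_1}{(x+a)^{n-1}} + \cdots + \frac{B_{n-1}}{x+a}, \tag{16.59}
$$

where

$$
A_k = \frac{(-1)^k\, n(n+1)\cdots(n+k-1)}{k!\,a^{n+k}}, \qquad B_k = \frac{(-1)^m\, m(m+1)\cdots(m+k-1)}{k!\,a^{m+k}}. \tag{16.60}
$$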
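The partial fractions identity can be confirmed in exact arithmetic. The sketch below assumes the coefficients are A_k = (−1)ᵏ n(n+1)···(n+k−1)/(k! aⁿ⁺ᵏ) and B_k = (−1)ᵐ m(m+1)···(m+k−1)/(k! aᵐ⁺ᵏ); the helper names are illustrative.

```python
from fractions import Fraction
from math import factorial

def rising(n, k):
    """n (n+1) ... (n+k-1), with the empty product equal to 1."""
    out = 1
    for j in range(k):
        out *= n + j
    return out

def partial_fraction_sum(x, a, m, n):
    """Sum of the partial fractions A_k / x^(m-k) and B_k / (x+a)^(n-k)."""
    x, a = Fraction(x), Fraction(a)
    total = Fraction(0)
    for k in range(m):
        A = Fraction((-1) ** k * rising(n, k), factorial(k)) / a ** (n + k)
        total += A / x ** (m - k)
    for k in range(n):
        B = Fraction((-1) ** m * rising(m, k), factorial(k)) / a ** (m + k)
        total += B / (x + a) ** (n - k)
    return total

print(partial_fraction_sum(3, 2, 2, 3), Fraction(1, 3 ** 2 * 5 ** 3))
```

Note that the coefficients of 1/x and 1/(x + a) are equal and opposite, so the two individually divergent sums over x combine into a convergent one.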


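The identity (16.59) was needed when he considered the series

$$
\zeta(m)\zeta(n) - \zeta(m+n) = \sum_{x=1}^{\infty}\sum_{a=1}^{\infty}\frac{1}{x^m(x+a)^n} + \sum_{x=1}^{\infty}\sum_{a=1}^{\infty}\frac{1}{x^n(x+a)^m}. \tag{16.61}
$$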


Needless to say, Euler did not use this notation in his letter or in the paper he wrote 
up more than three decades later, in 1775. He wrote several terms of the expressions 
in (16.59) and (16.61) to make it clear how the series progressed. In his 1775 paper,⁴⁷
Euler noted that (16.59) could be obtained by the method he had presented in his
Introductio of 1748. Here note that Euler’s notation for $\zeta(s)$ and $\zeta_G(m,n)$ used the
integral sign for summation:

$$
\zeta(s) = \int\frac{1}{z^s}, \qquad \zeta_G(m,n) = \int\frac{1}{z^m}\left(\frac{1}{y^n}\right). \tag{16.62}
$$


To evaluate (16.61), Euler started with the simple observation that for a positive 
integer a 


$$
\sum_{z=1}^{\infty}\left(\frac{1}{z} - \frac{1}{z+a}\right) = 1 + \frac12 + \cdots + \frac1a. \tag{16.63}
$$


46 ibid. pp. 200-208, especially p. 204. 
47 Eu. I-15 pp. 217-267. E 477.
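He then applied (16.59) and (16.60) to transform the series s:

$$
\begin{aligned}
s &= \sum_{z=1}^{\infty}\frac{1}{z^m(z+a)^n} \\
&= \frac{1}{a^n}\sum_{z=1}^{\infty}\frac{1}{z^m} - \frac{n}{a^{n+1}}\sum_{z=1}^{\infty}\frac{1}{z^{m-1}} + \frac{n(n+1)}{2!\,a^{n+2}}\sum_{z=1}^{\infty}\frac{1}{z^{m-2}} - \cdots \\
&\quad + (-1)^m\left(\frac{1}{a^m}\sum_{z=1}^{\infty}\frac{1}{z^n} + \frac{m}{a^{m+1}}\sum_{z=1}^{\infty}\frac{1}{z^{n-1}} + \frac{m(m+1)}{2!\,a^{m+2}}\sum_{z=1}^{\infty}\frac{1}{z^{n-2}} + \cdots\right) \\
&\quad + (-1)^{m+1}\left(\frac{1}{a^m}\sum_{k=1}^{a}\frac{1}{k^n} + \frac{m}{a^{m+1}}\sum_{k=1}^{a}\frac{1}{k^{n-1}} + \frac{m(m+1)}{2!\,a^{m+2}}\sum_{k=1}^{a}\frac{1}{k^{n-2}} + \cdots\right).
\end{aligned} \tag{16.64}
$$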


By summing over a, he then obtained an expression for the first series on the right- 
hand side of (16.61): 


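$$
\begin{aligned}
\sum_{x=1}^{\infty}\sum_{a=1}^{\infty}\frac{1}{x^m(x+a)^n}
&= \zeta(n)\zeta(m) - n\,\zeta(n+1)\zeta(m-1) + \frac{n(n+1)}{2!}\,\zeta(n+2)\zeta(m-2) - \cdots \\
&\quad + (-1)^m\left(\zeta(m)\zeta(n) + m\,\zeta(m+1)\zeta(n-1) + \frac{m(m+1)}{2!}\,\zeta(m+2)\zeta(n-2) + \cdots\right) \\
&\quad + (-1)^{m+1}\left(\zeta_G(m,n) + m\,\zeta_G(m+1,n-1) + \frac{m(m+1)}{2!}\,\zeta_G(m+2,n-2) + \cdots\right).
\end{aligned} \tag{16.65}
$$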


By interchanging m and n, Euler immediately got the formula for $\sum_{x=1}^{\infty}\sum_{a=1}^{\infty}\frac{1}{x^n(x+a)^m}$.
Thus, he had established all the formulas necessary to evaluate the results in (16.56). 
Euler evaluated several specific examples in his paper. He took m + n = 3 with 


m = 2,n = 1 and by applying (16.65) and using an analogous formula with m and n 
interchanged, he got 


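$$
\begin{aligned}
\zeta(2)\zeta(1) - \zeta(3) &= \zeta(2)\zeta(1) - \zeta_G(2,1) - \zeta(2)\zeta(1) + \zeta_G(1,2) + \zeta_G(2,1) \\
&= \zeta_G(1,2).
\end{aligned} \tag{16.66}
$$

Then by (16.57), he obtained

$$
\zeta(2)\zeta(1) + \zeta(3) = \zeta_G(2,1) + \zeta_G(1,2). \tag{16.67}
$$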
Again, by subtracting (16.66) from (16.67), Euler could conclude that

$$
\zeta_G(2,1) = 2\zeta(3), \tag{16.68}
$$

or in modern notation for the double-zeta function:

$$
\zeta(2,1) = \zeta(3). \tag{16.69}
$$


Observe that the series $\zeta(1)$ and $\zeta_G(1,2)$ are both logarithmically divergent, though it is
possible to suitably modify Euler’s argument to avoid the use of divergent series. To
compute $\zeta_G(4,1)$, Euler took m = 4, n = 1 to get, after simplification,

$$
\zeta_G(2,3) + \zeta_G(3,2) - \zeta_G(4,1) = 2\zeta(2)\zeta(3) - 2\zeta(5).
$$

He then took m = 2, n = 3 in (16.57) to obtain

$$
\zeta(2)\zeta(3) + \zeta(5) = \zeta_G(2,3) + \zeta_G(3,2). \tag{16.70}
$$

Combined with the previous equation, this gave $\zeta_G(4,1) = 3\zeta(5) - \zeta(2)\zeta(3)$. Euler
noted that he was unable to obtain $\zeta_G(2,3)$ and $\zeta_G(3,2)$ by this method. When he
set m = 3, n = 2 in (16.65), the result was once again (16.70). Euler effectively
remedied this drawback by developing a new algebraic method in the later part of 
the paper. 

The Euler-Goldbach double-zeta values can be generalized to multizeta values, 
defined as 


$$
\zeta(s_1, s_2, \ldots, s_k) = \sum_{n_1 > n_2 > \cdots > n_k > 0}\frac{1}{n_1^{s_1} n_2^{s_2}\cdots n_k^{s_k}}, \tag{16.71}
$$
where $s_1, s_2, \ldots, s_k$ are natural numbers. These values have been found to have
connections with basic objects in number theory, algebraic geometry, and topology. 
They have been studied by the use of methods from combinatorics, real and complex 
analysis, algebra, and number theory, creating great current interest in this topic. 
Christian Goldbach, who was the first to see the potential for double-zeta series, 
was a Prussian but moved to Russia in the 1720s and remained there until his death. In 
1725 he became secretary of conferences at the Petersburg Academy. He was a man 
of diverse talents and his chief hobbies were languages, number theory, differential 
calculus, and infinite series. He was one of the educators of the young Tsar Peter 
(1715-1730). As we have seen, Euler frequently communicated important results 
to his friend Goldbach, who proposed problems of possible interest to Euler and 
made helpful comments. Goldbach informed Euler of Fermat’s unproved theorems 
and proposed the well-known Goldbach conjecture. Thus, he succeeded in directing 
Euler’s extraordinary talents toward the field of number theory, in which Euler’s 
other colleagues had minimal interest before Lagrange entered the scene. The first 
volume of P. Fuss’s Correspondance mathématique et physique de quelques célèbres
géomètres du XVIIIème siècle contains 176 letters written by Euler and Goldbach to
each other over more than thirty-four years. Euler apparently regarded Goldbach as his 
close friend, writing him an urgent letter in 1738 when his eyesight was threatened. 
Goldbach then made unsuccessful attempts to relieve his friend of his burdensome 
responsibilities in geographical studies. We note that it was in 1766, when Euler 
returned to St. Petersburg after a twenty-five year stay in Berlin, that he became blind 
for all practical purposes. 


16.8 Secant and Tangent Numbers and $\zeta(2m)$


The evaluation of the series 


$$
\zeta(2m) = \sum_{n=1}^{\infty}\frac{1}{n^{2m}} \tag{16.72}
$$


was a major problem faced by early eighteenth-century mathematicians. First 
Mengoli⁴⁸ and then Jakob Bernoulli⁴⁹ attempted to sum the series for m = 1,
but they were not successful. In 1735, Euler became the first to evaluate the sum
(16.72).⁵⁰ Eventually, he gave more than six methods of summing the series exactly,
discussed in this chapter and elsewhere. One of his methods was to apply the geometric
series to the partial fractions expansions of the trigonometric functions given in, for
example, (15.42) through (15.45).⁵¹ Recall that the geometric series formula can
be written

$$
\frac{1}{a-x} = \frac1a\cdot\frac{1}{1-\frac{x}{a}} = \frac1a\left(1 + \frac{x}{a} + \frac{x^2}{a^2} + \cdots\right)
= \frac1a + \frac{x}{a^2} + \frac{x^2}{a^3} + \cdots, \qquad \left|\frac{x}{a}\right| < 1. \tag{16.73}
$$

Observe that (15.44) requires the series for $\frac{1}{(a-x)^2}$, and this can be found by taking
the derivative with respect to x of the equation (16.73). Thus we have

$$
\frac{d}{dx}(a-x)^{-1} = (a-x)^{-2} = \frac{d}{dx}\left(\frac1a + \frac{x}{a^2} + \frac{x^2}{a^3} + \cdots\right)
= \frac{1}{a^2} + \frac{2x}{a^3} + \frac{3x^2}{a^4} + \frac{4x^3}{a^5} + \cdots, \qquad \left|\frac{x}{a}\right| < 1. \tag{16.74}
$$


48 Mengoli (1650). 

49 Jakob Bernoulli (1744) vol. I, pp. 377-399, especially p. 398.

50 Eu. I-14 pp. 73-86. E 41.

51 See, for example, Eu. I-10 § 124, in part 2 of his 1755 book on differential calculus. 
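Euler could then write x times the right-hand side of (15.43) as

$$
\begin{aligned}
&1 - 2\left(\frac{x^2}{1-x^2} + \frac{x^2}{4-x^2} + \frac{x^2}{9-x^2} + \frac{x^2}{16-x^2} + \cdots\right) \\
&\quad = 1 - 2\left(\sum_{k=1}^{\infty}x^{2k} + \sum_{k=1}^{\infty}\frac{x^{2k}}{2^{2k}} + \sum_{k=1}^{\infty}\frac{x^{2k}}{3^{2k}} + \sum_{k=1}^{\infty}\frac{x^{2k}}{4^{2k}} + \cdots\right)
\end{aligned}
$$

and, by changing the order of summation,

$$
= 1 - 2\sum_{k=1}^{\infty}\left(\sum_{n=1}^{\infty}\frac{1}{n^{2k}}\right)x^{2k}. \tag{16.75}
$$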


Now if the left-hand side of (16.75) can be independently expressed as a power
series, then we can find $\sum_{n=1}^{\infty}\frac{1}{n^{2k}}$ by equating the coefficient of $x^{2k}$ on each side.
Euler achieved this step by using his generating function for Bernoulli numbers (2.36):


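$$
\frac{t}{e^t-1} = B_0 + \frac{B_1}{1!}t + \frac{B_2}{2!}t^2 + \frac{B_3}{3!}t^3 + \cdots
= 1 - \frac{t}{2} + \frac{B_2}{2!}t^2 + \frac{B_4}{4!}t^4 + \cdots, \qquad |t| < 2\pi, \tag{16.76}
$$
since $B_3 = B_5 = B_7 = \cdots = 0$. Recall that the odd Bernoulli numbers after $B_1$ are all zero
because

$$
\frac{t}{e^t-1} + \frac{t}{2}
$$

is an even function. Now x times the left-hand side of (15.43) produces $\pi x\cot\pi x$;
(16.76) and (15.17) then produce

$$
\pi x\cot\pi x = \pi x i\,\frac{e^{2\pi x i}+1}{e^{2\pi x i}-1} = \pi x i + \frac{2\pi x i}{e^{2\pi x i}-1}
= 1 + \sum_{k=1}^{\infty}(-1)^k\,\frac{B_{2k}\,2^{2k}\pi^{2k}}{(2k)!}\,x^{2k}, \qquad |x| < 1. \tag{16.77}
$$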


We point out that Euler treated series without too much concern for convergence 
conditions and thus did not state the condition |x| < 1. However, his reasoning shows 
that he could easily have proved that the series converged for |x| < 1. The explicit 
treatment of convergence and divergence was at that time still developing. 

Next, equate the coefficient of $x^{2k}$ in (16.75) and (16.77) to arrive at

$$
\sum_{n=1}^{\infty}\frac{1}{n^{2k}} = (-1)^{k-1}\,\frac{2^{2k-1}\pi^{2k}B_{2k}}{(2k)!}, \tag{16.78}
$$


thus completing one of Euler’s proofs of (16.78). 
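The closed form can be compared with direct summation. The sketch below assumes the formula reads ζ(2k) = (−1)ᵏ⁻¹ 2²ᵏ⁻¹ π²ᵏ B₂ₖ/(2k)! and generates the Bernoulli numbers (with B₁ = −1/2) from the standard recurrence; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(count):
    """B_0, ..., B_{count-1} (B_1 = -1/2) from sum_{j<=m} C(m+1, j) B_j = 0."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def zeta_even(k):
    """Closed form for zeta(2k) in terms of B_{2k}, returned as a float."""
    B2k = bernoulli_numbers(2 * k + 1)[2 * k]
    rational = (-1) ** (k - 1) * Fraction(2 ** (2 * k - 1), factorial(2 * k)) * B2k
    return float(rational) * pi ** (2 * k)

def zeta_direct(s, terms=10_000):
    """Partial sum of the defining series."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))

for k in (1, 2, 3):
    print(2 * k, zeta_even(k), zeta_direct(2 * k))
```

For k = 1 the direct sum converges slowly, so the agreement there is only to about four decimal places with this many terms.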
Euler used $\sqrt{-1}$ in an exponent a few times in his Introductio, but most of the time
he avoided doing so. Instead, he usually wrote $\cos x + \sqrt{-1}\,\sin x$.
We summarize his derivation of (16.77), given in his work on differential calculus.⁵²
He first used (2.36) to show that

$$
\frac{x\left(e^{x/2}+e^{-x/2}\right)}{2\left(e^{x/2}-e^{-x/2}\right)} = 1 + B_2\frac{x^2}{2!} + B_4\frac{x^4}{4!} + B_6\frac{x^6}{6!} + \cdots, \tag{16.79}
$$

noting that when the exponentials were expanded in a series, then the left-hand side
could be written as

$$
\frac{1 + \dfrac{x^2}{2!\,2^2} + \dfrac{x^4}{4!\,2^4} + \dfrac{x^6}{6!\,2^6} + \cdots}{2\left(\dfrac12 + \dfrac{x^2}{3!\,2^3} + \dfrac{x^4}{5!\,2^5} + \cdots\right)}. \tag{16.80}
$$

When Euler changed $x^2$ to $-x^2$ in (16.80), the series in the numerator became
$\cos\frac{x}{2}$ and the series in the denominator became $\frac{1}{x}\sin\frac{x}{2}$. Moreover, the series in (16.79)
changed to

$$
1 - B_2\frac{x^2}{2!} + B_4\frac{x^4}{4!} - B_6\frac{x^6}{6!} + \cdots.
$$

Thus, after replacing x by $2\pi x$, he had (16.77).
Now in order to express csc x, appearing on the left-hand side of (15.42), as a Taylor
series, Euler observed that⁵³

$$
\csc x = \cot\frac{x}{2} - \cot x
$$

and then used (16.77) to obtain

$$
\pi x\csc\pi x = 1 + \sum_{k=1}^{\infty}(-1)^{k-1}\,\frac{B_{2k}\left(2^{2k}-2\right)\pi^{2k}}{(2k)!}\,x^{2k}. \tag{16.81}
$$


Next, expanding the right-hand side of (15.42) as a geometric series and then 
multiplying by x gave him 


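$$
1 + 2\sum_{k=1}^{\infty}\left(\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^{2k}}\right)x^{2k}. \tag{16.82}
$$

Equating the coefficients of $x^{2k}$ in (16.81) and (16.82), Euler arrived at

$$
\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^{2k}} = (-1)^{k-1}\,\frac{B_{2k}\left(2^{2k-1}-1\right)\pi^{2k}}{(2k)!}. \tag{16.83}
$$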


52 Eu. I-10 § 113-127. E 212.
53 Eu. I-10 § 223.
Although Euler was well aware that (16.83) could be obtained directly from
(16.78), he here wished to show that the alternating series could be obtained from
the partial fractions expansion of $\csc\pi x$.

He was able to obtain the Taylor expansion of tan x by using the fact that
$\tan x = \cot x - 2\cot 2x$:

$$
\tan x = \sum_{k=1}^{\infty}(-1)^{k-1}\,\frac{B_{2k}\,2^{2k}\left(2^{2k}-1\right)}{2k}\cdot\frac{x^{2k-1}}{(2k-1)!}. \tag{16.84}
$$

The coefficients of $\frac{x^{2k-1}}{(2k-1)!}$ in (16.84) are now called tangent numbers, denoted by $T_k$.
Thus

$$
T_k = (-1)^{k-1}\,\frac{B_{2k}\,2^{2k}\left(2^{2k}-1\right)}{2k}. \tag{16.85}
$$
Euler did not appear to have noted the fact that tangent numbers were integers.
But in section 222 of the second part of his differential calculus book,⁵⁴ he explicitly
noted that

$$
\tan x = \frac{2Ax}{1\cdot 2} + \frac{2^3Bx^3}{1\cdot 2\cdot 3\cdot 4} + \frac{2^5Cx^5}{1\cdot 2\cdots 6} + \frac{2^7Dx^7}{1\cdot 2\cdots 8} + \text{etc.},
$$

where A, B, C, D were integers, as he had shown in section 182. Thus, he actually had

$$
2\left(2^{2k}-1\right)B_{2k} = \text{integer}. \tag{16.86}
$$


Based on the manner in which he treated the Taylor series for sec x, it would
certainly appear that Euler could have proved that $T_k$, as given by (16.85), was an
integer.
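The integrality of the first few tangent numbers can be confirmed directly. The sketch below assumes the definition T_k = (−1)ᵏ⁻¹ B₂ₖ 2²ᵏ(2²ᵏ − 1)/(2k) and uses exact fractions, so any non-integer value would show up as a denominator different from 1. The function names are illustrative.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """B_0, ..., B_{count-1} (B_1 = -1/2) from sum_{j<=m} C(m+1, j) B_j = 0."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def tangent_numbers(count):
    """T_1, ..., T_count as exact fractions."""
    B = bernoulli_numbers(2 * count + 1)
    return [(-1) ** (k - 1) * B[2 * k] * 2 ** (2 * k) * (2 ** (2 * k) - 1) / (2 * k)
            for k in range(1, count + 1)]

print(tangent_numbers(5))
```

This is a check of finitely many cases, not a proof; the general statement needs an argument such as the one via sin x · sec x.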

Euler also found a series expression for sec x, expressed here with subscripts,
though that was not his notation:

$$
\sec x = \frac{1}{\cos x} = E_0 + \frac{E_2}{2!}x^2 + \frac{E_4}{4!}x^4 + \cdots,
$$

so he could then obtain

$$
\left(E_0 + \frac{E_2}{2!}x^2 + \frac{E_4}{4!}x^4 + \cdots\right)\left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right) = 1. \tag{16.87}
$$

To determine $E_0, E_2, E_4, \ldots$, he equated coefficients of powers of x to obtain
equations for $E_{2k}$, k = 0, 1, 2, ..., to find that the numbers were integers. These
numbers are now called secant numbers; in section 224,⁵⁵ Euler accurately calculated
them to be


54 Eu. I-10 § 222. E 212.
55 ibid § 224. 
$E_0 = 1$, $E_2 = 1$, $E_4 = 5$, $E_6 = 61$, $E_8 = 1385$, $E_{10} = 50521$, $E_{12} = 2702765$,
$E_{14} = 199360981$, $E_{16} = 19391512145$, $E_{18} = 2404879661671$.


We may imagine that such complex calculations must have afforded the mathemati- 
cians of the past not just skill, but very deep insight into the inner workings of their 
formulas. 

That $E_{2k}$ is a positive integer can be proved in general by induction. Note that
$E_0 = 1$ and equate the coefficients of $x^{2n}$ in (16.87) to obtain

$$
E_{2n} = \binom{2n}{2}E_{2n-2} - \binom{2n}{4}E_{2n-4} + \cdots + (-1)^{n-1}\binom{2n}{2n}E_0. \tag{16.88}
$$

If all the numbers up to $E_{2n-2}$ are integers, then by (16.88), $E_{2n}$ is an integer.
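The recurrence obtained by equating coefficients in sec x · cos x = 1 can be run mechanically. The sketch below assumes it takes the form E₂ₙ = C(2n,2)E₂ₙ₋₂ − C(2n,4)E₂ₙ₋₄ + ··· and uses only integer arithmetic; the function name is illustrative.

```python
from math import comb

def secant_numbers(count):
    """E_0, E_2, ..., E_{2(count-1)} from the alternating binomial recurrence."""
    E = [1]
    for n in range(1, count):
        E.append(sum((-1) ** (j - 1) * comb(2 * n, 2 * j) * E[n - j]
                     for j in range(1, n + 1)))
    return E

for index, value in enumerate(secant_numbers(10)):
    print(f"E_{2 * index} = {value}")
```

Run as written, the recurrence reproduces the listed values through E₁₆ and gives E₁₈ = 2404879675441, so readers may wish to compare the last entry of the printed list against Euler's original table.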


Based on the fact that $\tan x = \frac{\sin x}{\cos x} = \sin x\,\sec x$, one can show that the series for
sin x multiplied by the series for sec x produces a series in which the coefficient of
$\frac{x^{2k-1}}{(2k-1)!}$ is a positive integer. Thus, the tangent numbers are integers.
The Euler numbers are given by $E_k$ in the formula

$$
\sec x + \tan x = \sum_{k=0}^{\infty}E_k\,\frac{x^k}{k!}.
$$

With even subscripts, Euler numbers are called secant numbers; with odd subscripts,
tangent numbers. Euler determined⁵⁶ that

$$
1 - \frac{1}{3^{2k+1}} + \frac{1}{5^{2k+1}} - \frac{1}{7^{2k+1}} + \cdots = \frac{\pi^{2k+1}}{2^{2k+2}(2k)!}\,E_{2k}. \tag{16.89}
$$


To prove (16.89), Euler expanded the partial fractions on the right-hand side of 
(15.46) as geometric series. Note that this can be done for $|2x| < 1$ or $|x| < \frac12$. Euler
completed the proof by a change in the order of summation. 
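The odd-power alternating sums can be checked against the secant numbers numerically. The sketch below assumes the closed form π²ᵏ⁺¹E₂ₖ / (2²ᵏ⁺²(2k)!) and hard-codes the first few secant numbers; the names are illustrative.

```python
from math import factorial, pi

SECANT = [1, 1, 5, 61]  # E_0, E_2, E_4, E_6

def odd_alternating_sum(k, terms=2000):
    """Partial sum of 1 - 1/3^(2k+1) + 1/5^(2k+1) - ..."""
    p = 2 * k + 1
    return sum((-1) ** j / (2 * j + 1) ** p for j in range(terms))

def closed_form(k):
    """pi^(2k+1) * E_{2k} / (2^(2k+2) * (2k)!)."""
    return pi ** (2 * k + 1) * SECANT[k] / (2 ** (2 * k + 2) * factorial(2 * k))

for k in (1, 2, 3):
    print(k, odd_alternating_sum(k), closed_form(k))
```

The case k = 0 is the Madhava–Leibniz series itself, which converges too slowly for this kind of direct comparison.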


16.9 Landen and Spence: Evaluation of $\zeta(2k)$


Landen used the logarithm of −1 to evaluate $\sum_{n=1}^{\infty}\frac{1}{n^2}$ and, more generally, $\sum_{n=1}^{\infty}\frac{1}{n^{2k}}$.
He showed that the dilogarithm could be used for this purpose but complex numbers
were required. Thus, Landen here succeeded where Euler had failed. Landen started
his paper of 1760,⁵⁷ “A New Method of Computing the Sums of Certain Series,”
with the determination of the values of log(−1). He observed that if $x = \sin z$,
then

$$
\dot z = \frac{\dot x}{\sqrt{1-x^2}}
$$


56 Eu. I-10 § 224. E 212.
57 Landen (1760). 
or

$$
\frac{\dot z}{\sqrt{-1}} = \frac{\dot x}{\sqrt{x^2-1}},
$$

where the dot notation indicates the derivative.
He integrated, taking z = 0 where x = 0, to get

$$
\frac{z}{\sqrt{-1}} = \log\frac{x+\sqrt{x^2-1}}{\sqrt{-1}}.
$$

For $z = \frac{\pi}{2}$ and x = 1, he had

$$
\log\frac{1}{\sqrt{-1}} = \frac{\pi}{2\sqrt{-1}}
$$

and consequently

$$
\log\sqrt{-1} = -\frac{\pi}{2\sqrt{-1}}.
$$

Presumably because the square root of a number must have two values, Landen 
concluded that 


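$$
\log(-1) = 2\log\sqrt{-1} = \pm\frac{\pi}{\sqrt{-1}}. \tag{16.90}
$$

Landen’s fundamental relation for the dilogarithm was

$$
\operatorname{Li}_2(x) = \frac{\pi^2}{3} + \frac{\pi}{\sqrt{-1}}\log x - \frac12(\log x)^2 - \operatorname{Li}_2\!\left(\frac1x\right). \tag{16.91}
$$

His proof was straightforward; note that he decided to use the minus sign in (16.90):

$$
\begin{aligned}
\frac{x^{-1}}{1} + \frac{x^{-2}}{2} + \frac{x^{-3}}{3} + \cdots &= \log\frac{1}{1-\frac1x} = \log\frac{x}{x-1} \\
&= \log x + \log\frac{1}{1-x} + \log(-1) \\
&= -\frac{\pi}{\sqrt{-1}} + \log x + \log\frac{1}{1-x};
\end{aligned}
$$

dividing by x and integrating,

$$
-\frac{x^{-1}}{1^2} - \frac{x^{-2}}{2^2} - \frac{x^{-3}}{3^2} - \cdots = -\frac{\pi}{\sqrt{-1}}\log x + \frac12(\log x)^2 + \operatorname{Li}_2(x) + C. \tag{16.92}
$$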


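He set x = 1 to find that

$$
C = -2\sum_{n=1}^{\infty}\frac{1}{n^2},
$$

the value of which he derived by setting x = −1 in (16.92), thus obtaining

$$
-C = 2\sum_{n=1}^{\infty}\frac{1}{n^2} = 2\operatorname{Li}_2(-1) - \frac{\pi}{\sqrt{-1}}\log(-1) + \frac12(\log(-1))^2. \tag{16.93}
$$

Since

$$
2\operatorname{Li}_2(-1) = -2\left(1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots\right) = -2\left(1 - \frac{2}{2^2}\right)\left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots\right) = -\sum_{n=1}^{\infty}\frac{1}{n^2},
$$

equation (16.93) became

$$
3\sum_{n=1}^{\infty}\frac{1}{n^2} = -\frac{\pi}{\sqrt{-1}}\log(-1) + \frac12(\log(-1))^2. \tag{16.94}
$$

Landen took $\log(-1) = -\frac{\pi}{\sqrt{-1}}$ to derive (16.92), but he took $\log(-1) = +\frac{\pi}{\sqrt{-1}}$
in equation (16.94). Although he did not explain this, he clearly wished to obtain a
positive value for the series $\sum_{n=1}^{\infty}\frac{1}{n^2}$. So (16.94) simplified to

$$
-\frac12 C = \sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6}, \tag{16.95}
$$

completing Landen’s proof of (16.91).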
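Landen also considered the polylogarithmic functions, for −1 < x < 1 with n a
positive integer, defined by the series

$$
\operatorname{Li}_n(x) = \frac{x}{1^n} + \frac{x^2}{2^n} + \frac{x^3}{3^n} + \cdots. \tag{16.96}
$$

The n = 2 case would denote the dilogarithm. More generally, for all complex
values of x

$$
\operatorname{Li}_n(x) = \int_0^x\frac{\operatorname{Li}_{n-1}(t)}{t}\,dt. \tag{16.97}
$$

To obtain a formula for $\operatorname{Li}_3(x)$ corresponding to (16.91), Landen divided (16.91)
by x and integrated. Observe that if we set $u = \frac1x$, we have

$$
\int\frac{\operatorname{Li}_2\!\left(\frac1x\right)}{x}\,dx = -\int\frac{\operatorname{Li}_2(u)}{u}\,du = -\operatorname{Li}_3(u) = -\operatorname{Li}_3\!\left(\frac1x\right).
$$

Note that because Landen wrote $\operatorname{Li}_2\!\left(\frac1x\right)$ as a series, he had

$$
\operatorname{Li}_3\!\left(\frac1x\right) = \frac{x^{-1}}{1^3} + \frac{x^{-2}}{2^3} + \frac{x^{-3}}{3^3} + \cdots.
$$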
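After integration he arrived at

$$
\operatorname{Li}_3(x) = \frac{\pi^2}{3}\log x + \frac{\pi}{2\sqrt{-1}}(\log x)^2 - \frac{1}{2\cdot 3}(\log x)^3 + \operatorname{Li}_3\!\left(\frac1x\right) + C,
$$

where the constant of integration C can be seen to be zero upon setting x = 1; thus
Landen found

$$
\operatorname{Li}_3(x) = \frac{\pi^2}{3}\log x + \frac{\pi}{2\sqrt{-1}}(\log x)^2 - \frac{1}{2\cdot 3}(\log x)^3 + \operatorname{Li}_3\!\left(\frac1x\right). \tag{16.98}
$$

Similarly, by successive integration he discovered

$$
\operatorname{Li}_4(x) = 2\sum_{n=1}^{\infty}\frac{1}{n^4} + \frac{\pi^2}{2\cdot 3}(\log x)^2 + \frac{\pi}{2\cdot 3\sqrt{-1}}(\log x)^3 - \frac{1}{2\cdot 3\cdot 4}(\log x)^4 - \operatorname{Li}_4\!\left(\frac1x\right) \tag{16.99}
$$

and

$$
\operatorname{Li}_5(x) = \left(2\sum_{n=1}^{\infty}\frac{1}{n^4}\right)\log x + \frac{\pi^2}{2\cdot 3\cdot 3}(\log x)^3 + \frac{\pi}{2\cdot 3\cdot 4\sqrt{-1}}(\log x)^4 - \frac{1}{2\cdot 3\cdot 4\cdot 5}(\log x)^5 + \operatorname{Li}_5\!\left(\frac1x\right). \tag{16.100}
$$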


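Landen put $x = \frac{1}{\sqrt{-1}}$ in (16.98) and saw that

$$
\operatorname{Li}_3\!\left(\frac{1}{\sqrt{-1}}\right) - \operatorname{Li}_3\!\left(\sqrt{-1}\right) = -2\sqrt{-1}\left(1 - \frac{1}{3^3} + \frac{1}{5^3} - \cdots\right);
$$

using (16.90) he could write

$$
\frac{\pi^2}{3}\log\frac{1}{\sqrt{-1}} + \frac{\pi}{2\sqrt{-1}}\left(\log\frac{1}{\sqrt{-1}}\right)^2 - \frac{1}{2\cdot 3}\left(\log\frac{1}{\sqrt{-1}}\right)^3 = -\frac{\pi^3}{16}\sqrt{-1}.
$$

Thus, he arrived at the formula

$$
1 - \frac{1}{3^3} + \frac{1}{5^3} - \cdots = \frac{\pi^3}{32}, \tag{16.101}
$$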


$$
1 + \frac{1}{2^4} + \frac{1}{3^4} + \cdots = \frac{\pi^4}{90}. \tag{16.102}
$$


Landen pointed out that these formulas could be continued indefinitely, but he did 
not indicate a connection with Bernoulli numbers. 




To further understand Landen’s work, write equation (16.91) in the form

$$
\operatorname{Li}_2(x) + \operatorname{Li}_2\!\left(\frac1x\right) = \frac{\pi^2}{3} + \frac{\pi}{\sqrt{-1}}\log x - \frac12(\log x)^2. \tag{16.103}
$$

Since he was working with the series for $\operatorname{Li}_2(x)$ and $\operatorname{Li}_2\!\left(\frac1x\right)$, Landen noted that
the values of x were those for which both series converged. Such values of x are
given by $x = e^{i\theta}$, though he did not write that specifically. As we have seen, he
used only $x = \pm 1, \pm i$ and the corresponding principal values of θ are, respectively,
$\theta = 0, \pi, \frac{\pi}{2}, -\frac{\pi}{2}$, since the principal values fall in the range $-\pi < \theta \le \pi$. However,
we have also seen that Landen took $e^{i\theta} = -1$, sometimes with θ = π and at other
times with θ = −π. The reason for this can be understood when one takes $x = e^{i\theta}$ in
(16.103) and arrives at a mistake:

$$
\operatorname{Li}_2\!\left(e^{i\theta}\right) + \operatorname{Li}_2\!\left(e^{-i\theta}\right) = 2\left(\cos\theta + \frac{\cos 2\theta}{2^2} + \frac{\cos 3\theta}{3^2} + \cdots\right). \tag{16.104}
$$

Substituting

$$
2\cos k\theta = e^{ik\theta} + e^{-ik\theta}
$$

in (16.103) yields

$$
\sum_{k=1}^{\infty}\frac{\cos k\theta}{k^2} = \frac{\pi^2}{6} + \frac{\pi}{2}\theta + \frac{\theta^2}{4}, \tag{16.105}
$$

where the right-hand side should actually take the form, as discussed in Section 20.6,
of the second Bernoulli polynomial, multiplied by a constant; in other words, the term
$+\frac{\pi}{2}\theta$ on the right-hand side should actually be $-\frac{\pi}{2}\theta$.

Now in this connection, we point out that in section 53 of his paper, presented to the 


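Petersburg Academy in 1753 and published in 1760, “Subsidium calculi sinuum,”⁵⁸
Euler derived the formula

$$
\sum_{k=1}^{\infty}(-1)^{k-1}\frac{\cos k\phi}{k^2} = \frac{\pi^2}{12} - \frac{\phi^2}{4}. \tag{16.106}
$$

Now set $\phi = \pi - \theta$ to obtain a correct version of Landen’s (16.105):

$$
\sum_{k=1}^{\infty}\frac{\cos k\theta}{k^2} = \frac14(\pi-\theta)^2 - \frac{\pi^2}{12} = \frac14\theta^2 - \frac{\pi}{2}\theta + \frac{\pi^2}{6}. \tag{16.107}
$$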
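The sign issue in Landen's cosine series can be seen numerically. The sketch below sums Σ cos(kθ)/k² directly and compares it with two candidate closed forms that differ only in the sign of the (π/2)θ term; which candidate corresponds to which historical formula is my reading of the discussion here, and the function names are illustrative.

```python
from math import cos, pi

def cosine_series(theta, terms=100_000):
    """Partial sum of sum_{k>=1} cos(k*theta) / k^2."""
    return sum(cos(k * theta) / (k * k) for k in range(1, terms + 1))

def with_minus_sign(theta):
    """theta^2/4 - (pi/2)*theta + pi^2/6, the form obtained from Euler's formula."""
    return theta ** 2 / 4 - pi * theta / 2 + pi ** 2 / 6

def with_plus_sign(theta):
    """theta^2/4 + (pi/2)*theta + pi^2/6, the form with the opposite sign."""
    return theta ** 2 / 4 + pi * theta / 2 + pi ** 2 / 6

theta = 1.0
print(cosine_series(theta), with_minus_sign(theta), with_plus_sign(theta))
```

For 0 < θ < 2π the two candidates differ by πθ, so the comparison is unambiguous even with a modest number of terms.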


Initial appearances to the contrary, Landen’s formula (16.103) holds for $1 < x < \infty$,
but not for $0 < x < 1$; the right-hand side is not invariant under the transformation
$x \to \frac1x$. When x is changed to $\frac1x$, the result is a different formula, valid for $0 < x < 1$.


58 Eu. I-14 pp. 542-584. E 246.
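Now in a paper presented to the Petersburg Academy in 1778 but published in
1811,⁵⁹ Euler presented a dilogarithm formula that was invariant under $x \to \frac1x$:

$$
\operatorname{Li}_2(-x) + \operatorname{Li}_2\!\left(-\frac1x\right) = -\frac{\pi^2}{6} - \frac12(\log x)^2. \tag{16.108}
$$

In section 27 of his paper, he gave this result in the form:

$$
\text{If}\quad X = \frac{x}{1} - \frac{x^2}{4} + \frac{x^3}{9} - \frac{x^4}{16} + \cdots \quad\text{and}\quad Y = \frac{1}{x} - \frac{1}{4x^2} + \frac{1}{9x^3} - \frac{1}{16x^4} + \cdots,
$$
$$
\text{then}\quad X + Y = \frac{\pi^2}{6} + \frac12(\log x)^2,
$$

and proved it thus: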

and proved it thus: 


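$$
X = \int\frac{dx}{x}\log(1+x), \qquad Y = -\int\frac{dx}{x}\log\left(1+\frac1x\right) = -\int\frac{dx}{x}\log(1+x) + \int\frac{dx}{x}\log x.
$$

Hence

$$
X + Y = \frac12(\log x)^2 + C.
$$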


To find C, Euler set x = 1 and used the fact, that he had discovered and had been 
using for more than forty years, that 


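$$
\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^2} = \frac{\pi^2}{12}. \tag{16.109}
$$

Thus, $C = \frac{\pi^2}{6}$ and (16.108) was proved. Observe that (16.109) follows from

$$
\sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6}. \tag{16.110}
$$

Euler could have found this by taking x = −1 and the principal value of $\log(-1) = \pi\sqrt{-1}$:

$$
X + Y = -2\sum_{n=1}^{\infty}\frac{1}{n^2} = \frac12(\log(-1))^2 + C = -\frac{\pi^2}{2} + C, \qquad\text{where } C = 2\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^2} = \sum_{n=1}^{\infty}\frac{1}{n^2}.
$$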


59 Eu. I-16 pp. 117-138. E 736.
In this way, Euler could have given another proof of (16.110). Here note that Euler
was not resistant to the use of complex numbers in this context. In fact, he remarked
that $\operatorname{Li}_2(x)$, when x > 1, was complex-valued, as can also be perceived from Landen’s
formula (16.103). Euler had already given a half dozen proofs of the result (16.110), 
so perhaps he did not see a need for another. 

Interestingly, Spence published (16.108) in his 1809 book⁶¹ two years before
Euler’s paper was published. Spence also proved corresponding results for polylog- 
arithms in a slightly different notation, though in this section we adhere to a uniform 
notation. William Spence was a Scottish mathematician who does not appear to 
have been affiliated with any university. As we mentioned in Section 15.1, unlike 
most British mathematicians at that time, Spence was familiar with the works of 
continental mathematicians, including the Bernoullis, Euler, and Lagrange. Spence 
was not familiar with Landen’s 1760 paper in 1809 when he wrote his work on 
logarithmic transcendents, or polylogarithms, though he makes reference to Landen 
in his preface,⁶² having discovered his work at the last minute. Now recall there was a
mistake in Landen’s work, since he used a formula outside its domain of applicability. 
This led to incorrect results on the polylogarithms, though he managed to deduce the 
correct values of 


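$$
\zeta(2k) = \sum_{n=1}^{\infty}\frac{1}{n^{2k}}
$$

by judiciously choosing the principal value $\log(-1) = \pi\sqrt{-1}$ or a non-principal
value $\log(-1) = -\pi\sqrt{-1}$, so as to yield a positive value for the sum. Spence, on the
other hand, avoided this pitfall by working with $\operatorname{Li}_2(-x)$ instead of $\operatorname{Li}_2(x)$.

In § 29 of his book, Spence stated the formulas

$$
\operatorname{Li}_2(-x) = -\operatorname{Li}_2\!\left(-\frac1x\right) - \frac12(\log x)^2 + 2\operatorname{Li}_2(-1), \tag{16.111}
$$
$$
\operatorname{Li}_3(-x) = \operatorname{Li}_3\!\left(-\frac1x\right) - \frac{1}{2\cdot 3}(\log x)^3 + 2\operatorname{Li}_2(-1)\log x. \tag{16.112}
$$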


He verified (16.111) in an earlier section in exactly the same way that Euler had
proved it. He proved (16.112) by dividing (16.111) by x and then integrating. To derive
the general formulas for $\operatorname{Li}_{2n}(-x)$ and $\operatorname{Li}_{2n-1}(-x)$, he first assumed

$$
\operatorname{Li}_{2n}(-x) = -\operatorname{Li}_{2n}\!\left(-\frac1x\right) + A_0^{(n)} + A_1^{(n)}\log x + \cdots + A_{2n}^{(n)}(\log x)^{2n}. \tag{16.113}
$$

He took the derivative of (16.113) and multiplied by x to find

$$
\operatorname{Li}_{2n-1}(-x) = \operatorname{Li}_{2n-1}\!\left(-\frac1x\right) + A_1^{(n)} + 2A_2^{(n)}\log x + \cdots + 2nA_{2n}^{(n)}(\log x)^{2n-1}, \tag{16.114}
$$


60 Eu. I-16 section 21, pp. 117-138. E 736.
61 Spence (1809).
62 ibid. p. xii. 
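then took the derivative and multiplied by x to arrive at

$$
\operatorname{Li}_{2n-2}(-x) = -\operatorname{Li}_{2n-2}\!\left(-\frac1x\right) + 1\cdot 2\,A_2^{(n)} + 2\cdot 3\,A_3^{(n)}\log x + \cdots + (2n-1)2n\,A_{2n}^{(n)}(\log x)^{2n-2}. \tag{16.115}
$$

Spence next changed n to n − 1 in (16.113) to obtain

$$
\operatorname{Li}_{2n-2}(-x) = -\operatorname{Li}_{2n-2}\!\left(-\frac1x\right) + A_0^{(n-1)} + A_1^{(n-1)}\log x + \cdots + A_{2n-2}^{(n-1)}(\log x)^{2n-2}. \tag{16.116}
$$

Equating the coefficients of the powers of log x in (16.115) and (16.116), he could
write

$$
A_2^{(n)} = \frac{A_0^{(n-1)}}{1\cdot 2}, \quad A_3^{(n)} = \frac{A_1^{(n-1)}}{2\cdot 3}, \quad A_4^{(n)} = \frac{A_2^{(n-1)}}{3\cdot 4}, \quad \ldots, \quad A_{2n}^{(n)} = \frac{A_{2n-2}^{(n-1)}}{(2n-1)2n}, \tag{16.117}
$$

and these equations produce

$$
A_4^{(n)} = \frac{A_0^{(n-2)}}{4!}, \quad A_5^{(n)} = \frac{A_1^{(n-2)}}{5!}, \quad A_6^{(n)} = \frac{A_0^{(n-3)}}{6!}, \quad A_7^{(n)} = \frac{A_1^{(n-3)}}{7!}, \quad \ldots, \quad A_{2n}^{(n)} = \frac{A_2^{(1)}}{3\cdot 4\cdots 2n}.
$$

From (16.111), $A_2^{(1)} = -\frac12$ and hence $A_{2n}^{(n)} = -\frac{1}{(2n)!}$. Combining his results,
Spence obtained the relations

$$
\begin{aligned}
\operatorname{Li}_{2n}(-x) + \operatorname{Li}_{2n}\!\left(-\frac1x\right) = A_0^{(n)} &+ A_1^{(n)}\log x + \frac{A_0^{(n-1)}}{2!}(\log x)^2 + \frac{A_1^{(n-1)}}{3!}(\log x)^3 \\
&+ \frac{A_0^{(n-2)}}{4!}(\log x)^4 + \frac{A_1^{(n-2)}}{5!}(\log x)^5 + \cdots - \frac{1}{(2n)!}(\log x)^{2n}
\end{aligned} \tag{16.118}
$$

and

$$
\begin{aligned}
\operatorname{Li}_{2n-1}(-x) - \operatorname{Li}_{2n-1}\!\left(-\frac1x\right) = A_1^{(n)} &+ A_0^{(n-1)}\log x + \frac{A_1^{(n-1)}}{2!}(\log x)^2 \\
&+ \frac{A_0^{(n-2)}}{3!}(\log x)^3 + \cdots - \frac{1}{(2n-1)!}(\log x)^{2n-1}.
\end{aligned} \tag{16.119}
$$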


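To completely determine the right-hand sides of equations (16.118) and (16.119),
Spence needed only the values of $A_0^{(n)}$ and $A_1^{(n)}$. To find $A_0^{(n)}$ and $A_1^{(n)}$, he set x = 1
in (16.118) and (16.119) to obtain

$$
A_0^{(n)} = 2\operatorname{Li}_{2n}(-1) \quad\text{and}\quad A_1^{(n)} = 0.
$$

Thus, he had the final formulas of section 29 of his book:

$$
\operatorname{Li}_{2n}(-x) + \operatorname{Li}_{2n}\!\left(-\frac1x\right) = 2\operatorname{Li}_{2n}(-1) + 2\operatorname{Li}_{2n-2}(-1)\frac{(\log x)^2}{2!} + 2\operatorname{Li}_{2n-4}(-1)\frac{(\log x)^4}{4!} + \cdots - \frac{(\log x)^{2n}}{(2n)!}, \tag{16.120}
$$
$$
\operatorname{Li}_{2n-1}(-x) - \operatorname{Li}_{2n-1}\!\left(-\frac1x\right) = 2\operatorname{Li}_{2n-2}(-1)\log x + 2\operatorname{Li}_{2n-4}(-1)\frac{(\log x)^3}{3!} + 2\operatorname{Li}_{2n-6}(-1)\frac{(\log x)^5}{5!} + \cdots - \frac{(\log x)^{2n-1}}{(2n-1)!}. \tag{16.121}
$$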


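Formulas (16.120) and (16.121) are true for all integers n ≥ 1. For n = 1, one
must define $\operatorname{Li}_1(-x)$ and $\operatorname{Li}_0(-x)$. Observe that $\operatorname{Li}_{n-1}(-x)$ can be obtained from
$\operatorname{Li}_n(-x)$ for n ≥ 3 by differentiating $\operatorname{Li}_n(-x)$ and multiplying the result by x.
Spence therefore defined

$$
\operatorname{Li}_1(-x) = -\log(1+x) \quad\text{and}\quad \operatorname{Li}_0(-x) = -\frac{x}{1+x}. \tag{16.122}
$$

In his section 31, Spence considered the problem of finding $C = \sum_{k=1}^{\infty}\frac{1}{k^{2n}}$. For
this purpose, he observed that

$$
-\operatorname{Li}_{2n}(-1) = \sum_{k=1}^{\infty}\frac{(-1)^{k-1}}{k^{2n}} = \sum_{k=1}^{\infty}\frac{1}{k^{2n}} - 2\sum_{k=1}^{\infty}\frac{1}{(2k)^{2n}} = C - \frac{2}{2^{2n}}C = \left(1 - \frac{1}{2^{2n-1}}\right)C. \tag{16.123}
$$

To determine the value of C when n = 1, Spence set n = 1 and x = −1 in (16.120)
to obtain

$$
2\operatorname{Li}_2(1) = 2\operatorname{Li}_2(-1) + 2\operatorname{Li}_0(-1)\frac{(\log(-1))^2}{2!}.
$$

Taking the principal value of log(−1) as $\pi\sqrt{-1}$, he found

$$
\operatorname{Li}_2(1) = \frac{\pi^2}{6},
$$

and with n = 2 he had

$$
\begin{aligned}
2\operatorname{Li}_4(1) &= 2\operatorname{Li}_4(-1) + 2\operatorname{Li}_2(-1)\frac{(\log(-1))^2}{2!} + 2\operatorname{Li}_0(-1)\frac{(\log(-1))^4}{4!} \\
&= -2\left(1-\frac{1}{2^3}\right)\operatorname{Li}_4(1) + 2\left(1-\frac12\right)\operatorname{Li}_2(1)\frac{\pi^2}{2!} - \frac{\pi^4}{4!} \\
&= -\frac74\operatorname{Li}_4(1) + \frac{\pi^2}{2}\operatorname{Li}_2(1) - \frac{\pi^4}{24}.
\end{aligned}
$$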
Since he had already found that $\operatorname{Li}_2(1) = \frac{\pi^2}{6}$, Spence could conclude

$$
\frac{15}{4}\operatorname{Li}_4(1) = \frac{\pi^4}{12} - \frac{\pi^4}{24} = \frac{\pi^4}{24} \quad\text{or}\quad \operatorname{Li}_4(1) = \frac{\pi^4}{90}.
$$

Using similar reasoning, Spence successively determined the values of $\operatorname{Li}_6(1)$,
$\operatorname{Li}_8(1)$, and $\operatorname{Li}_{10}(1)$. However, Spence was very interested in the numerical values of
these quantities and he focused on methods that would yield good approximations. In
any case, Spence obtained the correct formula for $\operatorname{Li}_{2n}(-x) + \operatorname{Li}_{2n}\!\left(-\frac1x\right)$, the formula
he used for finding $\operatorname{Li}_{2n}(1)$ for some values of n.
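Spence's successive determination of the even zeta values can be mimicked in exact arithmetic. The sketch below assumes his formula at x = −1 with log(−1) = π√−1, namely 2Li₂ₙ(1) = Σₘ 2Li₂ₙ₋₂ₘ(−1)(iπ)²ᵐ/(2m)! − (iπ)²ⁿ/(2n)!, together with Li₂ⱼ(−1) = −(1 − 2¹⁻²ʲ)Li₂ⱼ(1), and solves for the rational numbers ζ(2n)/π²ⁿ one at a time. The function name is illustrative.

```python
from fractions import Fraction
from math import factorial

def zeta_even_over_pi_powers(count):
    """Return [zeta(2)/pi^2, zeta(4)/pi^4, ...] as exact fractions."""
    z = {}
    for n in range(1, count + 1):
        # Contribution of the last term, -(i*pi)^(2n) / (2n)!, divided by pi^(2n).
        rhs = Fraction(-((-1) ** n), factorial(2 * n))
        for m in range(1, n):
            j = n - m
            li_minus_one = -(1 - Fraction(2, 4 ** j)) * z[j]  # Li_{2j}(-1) / pi^(2j)
            rhs += 2 * li_minus_one * (-1) ** m / factorial(2 * m)
        # 2 z_n = -2 (1 - 2^(1-2n)) z_n + rhs, solved for z_n.
        z[n] = rhs / (2 + 2 * (1 - Fraction(2, 4 ** n)))
    return [z[n] for n in range(1, count + 1)]

print(zeta_even_over_pi_powers(4))
```

Each step uses only the previously computed values, which mirrors the way the text describes Spence passing from Li₂(1) to Li₄(1) and onward.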
Recall from our Chapter 2 the formula (2.35) of Euler, or (16.78):

$$
\sum_{n=1}^{\infty}\frac{1}{n^{2k}} = (-1)^{k-1}\,\frac{2^{2k-1}\pi^{2k}B_{2k}}{(2k)!}.
$$


One might raise the question: Could Spence’s formulas be employed to produce
Euler’s general result? A hint of this possibility appears when we rewrite (16.107) as

$$
\sum_{n=1}^{\infty}\frac{\cos 2\pi n\phi}{\pi^2 n^2} = \phi^2 - \phi + \frac16. \tag{16.124}
$$

Since the first three Bernoulli numbers are $B_0 = 1$, $B_1 = -\frac12$, $B_2 = \frac16$, and using
the definition (2.48), we see that the right-hand side of (16.124) can be rewritten as

$$
\binom{2}{0}B_0\phi^2 + \binom{2}{1}B_1\phi + \binom{2}{2}B_2 = B_2(\phi),
$$

the Bernoulli polynomial of degree 2. This connection between Bernoulli polynomials
and trigonometric series shown in equation (16.124) was explicitly illustrated in 1834, when Jacobi ascertained
the remainder term for the Euler—Maclaurin series in terms of Bernoulli polynomials; 
this was clearly the same as the expression for this remainder in terms of Fourier 
series, found by Poisson in 1826. For a discussion of the remainder term found by 
Poisson and by Jacobi, see Sections 20.4 and 20.5. Based on the results of Jacobi 
and Poisson, we have the formula 


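$$
B_{2k}(\phi) = 2(-1)^{k-1}(2k)!\sum_{n=1}^{\infty}\frac{\cos 2\pi n\phi}{(2\pi n)^{2k}}, \qquad 0 \le \phi \le 1, \quad k \ge 1, \tag{16.125}
$$

and then by differentiation,

$$
B_{2k-1}(\phi) = 2(-1)^{k}(2k-1)!\sum_{n=1}^{\infty}\frac{\sin 2\pi n\phi}{(2\pi n)^{2k-1}}, \qquad 0 \le \phi \le 1, \quad k \ge 1, \tag{16.126}
$$

where the range of φ in (16.126) is 0 < φ < 1 when k = 1. A further discussion of
this is contained in Chapter 20.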
In his paper, presented to the Petersburg Academy in 1774 and published in 1775,
Euler repeatedly integrated (16.107) to obtain formulas (16.125) and (16.126) for
1 ≤ k ≤ 4. He worked with the variable θ = 2πφ so that he got the polynomials
$B_n\!\left(\frac{\theta}{2\pi}\right)$; he apparently did not perceive that these polynomials were related to the
polynomials obtained by Jakob Bernoulli when he expressed $1^n + 2^n + \cdots + (m-1)^n$
as polynomials in m.

To verify this relationship inductively, suppose (16.125) true up to some value
k ≥ 1. Apply the operation $(2k+1)\int_0^x$ to both sides of this equation for 0 ≤ x ≤ 1
to obtain

$$
x^{2k+1} + \binom{2k+1}{1}B_1x^{2k} + \cdots + \binom{2k+1}{2k}B_{2k}x + C = 2(-1)^{k+1}(2k+1)!\sum_{n=1}^{\infty}\frac{\sin 2\pi n x}{(2\pi n)^{2k+1}}, \tag{16.127}
$$

where C is the constant of integration, equal to zero when x = 0. Thus, the left-hand
side of (16.127) is $B_{2k+1}(x)$ and applying $(2k+2)\int_0^x$ to both sides, we get

$$
x^{2k+2} + \binom{2k+2}{1}B_1x^{2k+1} + \cdots + \binom{2k+2}{2k}B_{2k}x^2 + C = 2(-1)^{k}(2k+2)!\sum_{n=1}^{\infty}\frac{\cos 2\pi n x}{(2\pi n)^{2k+2}}, \tag{16.128}
$$

where C is again the constant of integration. Setting x = 0, we see that

$$
C = \frac{2(-1)^k(2k+2)!}{(2\pi)^{2k+2}}\sum_{n=1}^{\infty}\frac{1}{n^{2k+2}}.
$$

One more integration produces a formula similar to (16.127), in which we set x = 1
to conclude that C also satisfies

$$
1 + \binom{2k+3}{1}B_1 + \cdots + \binom{2k+3}{2k}B_{2k} + \binom{2k+3}{2k+2}C = 0,
$$

thus showing that $C = B_{2k+2}$ and the polynomial on the left-hand side of (16.128)
is $B_{2k+2}(x)$. We have thus proved (16.125) and (16.126) and have provided another
proof of (2.35).

Recall that Raabe defined Bernoulli polynomials, which he called Jacob Bernoulli 
functions, as the right-hand sides of equations (16.125) and (16.126). Moreover, he 
defined the Bernoulli numbers by equation (2.35). These definitions allowed him to 
easily prove some interesting properties of Bernoulli polynomials. For example, from 
(16.125) and (16.126) it was immediately clear that 

$$B_{2k}(x) = B_{2k}(1-x), \qquad B_{2k-1}(x) = -B_{2k-1}(1-x). \tag{16.129}$$

As we saw in Chapter 2, (16.129) was proved by Jacobi; see (2.54). It can also be 
verified by using Jacobi’s generating function for the Bernoulli polynomials, 
defined by 

$$\frac{t e^{xt}}{e^t - 1} = \sum_{k=0}^{\infty} B_k(x) \frac{t^k}{k!}.$$

A special case of Raabe’s (2.56) is given by: 

$$B_k(2x) = 2^{k-1}\left(B_k(x) + B_k\left(x + \tfrac{1}{2}\right)\right). \tag{16.130}$$

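The symmetry relations $B_{2k}(x) = B_{2k}(1-x)$, $B_{2k-1}(x) = -B_{2k-1}(1-x)$ of (16.129) and the duplication relation $B_k(2x) = 2^{k-1}(B_k(x) + B_k(x + \frac12))$ of (16.130) can be checked exactly in rational arithmetic. The following is only an illustrative sketch: it assumes the standard convention $B_1 = -\frac12$ and the expansion $B_k(x) = \sum_j \binom{k}{j} B_j x^{k-j}$, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max], assuming the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # Recurrence: sum_{j=0}^{n} C(n+1, j) B_j = 0 for n >= 1.
        s = sum(comb(n + 1, j) * B[j] for j in range(n))
        B.append(-s / (n + 1))
    return B

def bernoulli_poly(k, x, B):
    """Evaluate B_k(x) = sum_j C(k, j) B_j x^(k-j) exactly."""
    x = Fraction(x)
    return sum(comb(k, j) * B[j] * x ** (k - j) for j in range(k + 1))

B = bernoulli_numbers(12)
x = Fraction(2, 7)  # arbitrary rational test point
half = Fraction(1, 2)

# (16.129): B_k(x) = (-1)^k B_k(1 - x)
symmetry_ok = all(
    bernoulli_poly(k, x, B) == (-1) ** k * bernoulli_poly(k, 1 - x, B)
    for k in range(1, 13)
)
# (16.130): B_k(2x) = 2^(k-1) (B_k(x) + B_k(x + 1/2))
duplication_ok = all(
    bernoulli_poly(k, 2 * x, B)
    == 2 ** (k - 1) * (bernoulli_poly(k, x, B) + bernoulli_poly(k, x + half, B))
    for k in range(1, 13)
)
print(symmetry_ok, duplication_ok)
```

Because the arithmetic is exact, a passing check is an identity at the chosen test point rather than a floating-point coincidence; it is, of course, not a proof.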
Next observe that (16.129) and (16.130) can be employed to show that 
(16.125) follows from Spence’s formula (16.120), taking (2.35) as the definition of 
Bernoulli numbers. Though Spence did not prove this, it is interesting to work it out: 


Set x = e2"/(2-9) in the left-hand side of (16.120), yielding the series 


23° cos 2k 1 


k2n 
k=1 


so that the right-hand side of (16.120) becomes 


(-1)"(20)" 1 1 \ (2n 1)? 
ran (ta) mer (18) (4) me 
1 2n 1 
(2b) C)(e-f) me 
1 Fn 1 2n—2 1 2n 
+(1-5) (7) (6-5) B- (6-5) | (16.131) 


1 2 ¥ 
5 (Bon (x) 7 Bon (—x)) =x 4 e Boxe Cae 2B, 


Now note that 


so that 


(ete 2n (\2 
(6-5) =) mm 
[Ae Fp 1 1 
=2 (o = 5) 75 (2, (° 5) + Boy (; )) (16.132) 
Bon 2n 1\* Bon2 
-2( 32+ ('2) (6-3) q2n-2 1 


2 
= — sR (8 + i) (26 — 1)? Bono + ) 


ee (c 1) 5 ( Bane —1) + BaQ— 20))) (16.133) 


and 


22n-1 


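Upon adding (16.132) and (16.133), we see that the expression in (16.131) is 
equal to 

$$\frac{1}{2^{2n}}\left(B_{2n}(2\phi - 1) + B_{2n}(1 - 2\phi)\right) - \frac{1}{2}\left(B_{2n}\left(\phi - \tfrac{1}{2}\right) + B_{2n}\left(\tfrac{1}{2} - \phi\right)\right). \tag{16.134}$$

Using (16.130) we have 

$$\frac{1}{2^{2n}}\left(B_{2n}(2\phi - 1) + B_{2n}(1 - 2\phi)\right) = \frac{1}{2}\left(B_{2n}\left(\phi - \tfrac{1}{2}\right) + B_{2n}(\phi) + B_{2n}\left(\tfrac{1}{2} - \phi\right) + B_{2n}(1 - \phi)\right);$$

then, by means of equation (16.129), (16.134) simplifies to 

$$\frac{1}{2}\left(B_{2n}(\phi) + B_{2n}(1 - \phi)\right) = B_{2n}(\phi).$$

This proves that Spence’s formula (16.120) implies the Euler-Raabe formula 
(16.125) and hence also (16.126). 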


16.10 Exercises 
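(1) Show that 

$$1 - \frac{1}{3^5} + \frac{1}{5^5} - \frac{1}{7^5} + \frac{1}{9^5} - \text{etc.} = \frac{5\pi^5}{1536},$$

$$1 + \frac{1}{3^6} + \frac{1}{5^6} + \frac{1}{7^6} + \frac{1}{9^6} + \text{etc.} = \frac{\pi^6}{960},$$

$$1 - \frac{1}{3^7} + \frac{1}{5^7} - \frac{1}{7^7} + \frac{1}{9^7} - \text{etc.} = \frac{61\pi^7}{184320},$$

$$1 + \frac{1}{3^8} + \frac{1}{5^8} + \frac{1}{7^8} + \frac{1}{9^8} + \text{etc.} = \frac{17\pi^8}{161280}.$$

See Eu. I-14 p. 81. 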
(2) Express Newton’s series (16.11) as a Dirichlet L-series. Do the same with 
Euler’s series (16.36) and (16.38). 
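(3) Divide Takebe’s series 

$$\frac{1}{2}(\arcsin x)^2 = \frac{x^2}{2} + \frac{2}{3}\cdot\frac{x^4}{4} + \frac{2\cdot 4}{3\cdot 5}\cdot\frac{x^6}{6} + \frac{2\cdot 4\cdot 6}{3\cdot 5\cdot 7}\cdot\frac{x^8}{8} + \cdots$$

by $\sqrt{1 - x^2}$ and integrate term by term over (0,1) to obtain 

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6}.$$

See Eu. I-14 p. 184. 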
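(4) Show that 

$$1 + \frac{1}{2^{24}} + \frac{1}{3^{24}} + \frac{1}{4^{24}} + \frac{1}{5^{24}} + \cdots = \frac{2^{23}}{1\cdot 2\cdot 3\cdots 25}\cdot\frac{1181820455}{546}\,\pi^{24},$$

$$1 + \frac{1}{2^{26}} + \frac{1}{3^{26}} + \frac{1}{4^{26}} + \frac{1}{5^{26}} + \cdots = \frac{2^{25}}{1\cdot 2\cdot 3\cdots 27}\cdot\frac{76977927}{2}\,\pi^{26}.$$

See Eu. I-14 p. 185. 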
(5) Prove Eisenstein’s formula: 


oo 00 : 
. —1 o. T Ti mi-—qnri —oB-2n1 
obi S ( v _l@ (ct | aa at) 3 e at 
4 (G+ BY 4 x4 4 (6 +5)4 
See Weil (1989a). 


(6) Prove that for |a| < m andO <s <1, 


x vk ( z I ) 
ka (2k+1)r +a)’ ((2k+1)a —a)s 
i Se (—1)'' sinka 


~ P(s)sin kins 


k=1 


Deduce the functional equation for $L(s) = \sum_{k=0}^{\infty} (-1)^k (2k+1)^{-s}$. 

See Malmsten (1849). Carl Malmsten became professor of mathematics in 
Uppsala in 1841; during his career, he made significant contributions to the 
development of the Swedish mathematical tradition. See Gårding (1994). 


(7) Prove Goldbach’s formula 


1 1 1 1 1 x* 
14 14 1+—4 Coes. 
23 5 33 2° 3 72 


See Fuss (1968) p. 197. 
(8) Prove Euler’s formula 


aliens ees 2! ena i = a 
' 95 oe eae es Ge SS Same aoe Wey ae a 


See Fuss (1968) p. 190. 


(9) Prove Euler’s formula (16.52). See Fuss (1968) pp. 181-182. 
(10) Show that for $t = s - \frac{1}{2}$ and $0 < s < 1$, 

$$\int_0^1 \frac{x^{s-1} - x^{-s}}{1 - x}\,dx = -\pi \tan \pi t.$$

Euler took successive derivatives of both sides with respect to $s$ and set $s = \frac{1}{2}$ 
or $t = 0$. Verify that after taking the first derivative and setting $s = \frac{1}{2}$, the 
result is 

$$\int_0^1 \frac{2\ln x}{1 - x}\,\frac{dx}{\sqrt{x}} = -\pi^2,$$

or 

$$\int_0^1 \frac{\ln y}{1 - y^2}\,dy = -\frac{\pi^2}{8}.$$

More generally, show that 

$$\int_0^1 \frac{(\ln y)^{2k-1}}{1 - y^2}\,dy = (-1)^k (2^{2k} - 1)\frac{B_{2k}}{4k}\,\pi^{2k}. \tag{16.135}$$

Euler wrote down the formulas for k = 1, 2 and 3. See Eu. I-17 p. 406. 

(11) Show that for $t = s - \frac{1}{2}$ and $0 < s < 1$, 

$$\int_0^1 \frac{x^{s-1} + x^{-s}}{1 + x}\,dx = \pi \sec \pi t.$$

Use the method of the previous problem and the series for sec u given in this 
chapter to prove the formula for Euler numbers 

$$\int_0^1 \frac{(\ln y)^{2k}}{1 + y^2}\,dy = \frac{E_{2k}\,\pi^{2k+1}}{2^{2k+2}}.$$

Euler computed the Euler numbers $E_{2k}$ for k = 0, 1, 2, 3, 4 to obtain 
1, 1, 5, 61, 1385, respectively. See Eu. I-17 pp. 401, 405. 

(12) Let $P = \int_0^1 \frac{y(\ln y)^{2k-1}}{1 - y^2}\,dy$, and let $Q$ denote the integral in (16.135). Observe 
that 

$$P = \frac{1}{2^{2k}} \int_0^1 \frac{(\ln y)^{2k-1}}{1 - y}\,dy.$$

Deduce that 

$$\int_0^1 \frac{(\ln y)^{2k-1}}{1 - y}\,dy = (-1)^k \frac{B_{2k}}{4k}\,(2\pi)^{2k},$$

$$\int_0^1 \frac{(\ln y)^{2k-1}}{1 + y}\,dy = (-1)^k (2^{2k-1} - 1)\frac{B_{2k}}{2k}\,\pi^{2k}.$$

Euler gave this argument in Eu. I-17 pp. 406-407. 
(13) Show that 


; 7 T(2k) 
/ y” (In yy ‘dy = pry a 
0 


From this, compute the $\zeta$ and $L$-series values 
In 1737 Euler used integration to exactly evaluate $\sum_{n=1}^{\infty} \frac{1}{n^2}$, but he regretted 
that the method did not extend to k > 2. In 1774, he finally found what he 


was looking for. See Eu. I-17 pp. 428-451. 
The Gamma Function 


17.1 Preliminary Remarks 


In the 1720s, several mathematicians attempted to interpolate the sequence 

$$1!,\ 2!,\ 3!,\ 4!,\ 5!,\ \ldots$$

Clearly, the interpolating function would have to satisfy $n! = n((n-1)!)$ and 
$0! = 1$. To avoid confusion, we give the definition of the interpolating function $f(x)$, 
satisfying modern requirements: 

$$f(x+1) = x f(x), \qquad f(1) = 1, \tag{17.1}$$

implying that when $n$ is a positive integer, $f(n) = (n-1)!$. Now in his 1811 
Exercices de calcul intégral,¹ Legendre denoted the function $f$ by $\Gamma$; in a paper on the 
hypergeometric function,² Gauss represented $f(x+1)$ by $\Pi(x)$. Since some results on 
definite integrals could be more conveniently stated using Legendre’s definition, even 
influential German mathematicians such as Jacobi and Dirichlet began to employ 
$\Gamma$-function notation. Thus, the interpolating function for the factorial became known as 
the gamma function. 

Euler and Stirling made significant contributions to this problem starting in the late 
1720s. They worked independently, Euler in Russia and Stirling in Scotland; their 
approaches and aims were also distinct. In the mid-1730s, they came to know of each 
other’s work and had a brief correspondence. Always the algorist, Euler was interested 
in obtaining analytic expressions for the interpolating function f(x). His first paper 
on the subject, written in 1730, gives two different representations of f(x), one as 
an infinite product and the other as a definite integral. On the other hand, Stirling 
was a numerical analyst interested in finding efficient methods for computing f(x). 
He was undoubtedly extremely experienced in computation and demonstrated that he 
knew the values of many mathematical constants to several decimal places. Without 


1 Legendre (1811-1817) vol. 1, p. 277. 
2 Gauss (1813). See also Kikuchi (1891) for an English translation. 
giving an explicit analytic formula for $f(x)$, but making use of Newton’s method of 
interpolation, he calculated the value of $\left(\frac{1}{2}\right)! = \Gamma\left(\frac{3}{2}\right)$ as 0.8862269251. He recognized 
this as $\frac{\sqrt{\pi}}{2}$ and, indeed, this is the correct value of $\Gamma\left(\frac{3}{2}\right)$. 

There was a common feature in the thinking of Euler and Stirling: They both 
believed that there was only one reasonable or logical interpolating function f(x). 
Thus, Euler did not prove in his first paper that the integral and infinite product 
representations of f(x) were equal for all x > 0, but merely that they were equal for 
positive integral values of x. Similarly, Stirling thought that his numerical methods 
gave the value of the unique interpolating function f(x). The later work of H. 
Bohr and J. Mollerup from around 1920 showed that to obtain uniqueness of the 
interpolating function one must assume the convexity of $\ln f(x)$, in addition to the 
two above-mentioned properties f(x + 1) = x f(x) and f(1) = 1. 

Leonhard Euler (1707-1783) was born in Basel, Switzerland, and studied at the 
University of Basel from 1720 to 1724. After this he studied independently, concentrating 
on mathematics, physics, and astronomy, under the guidance of Johann Bernoulli, 
with whom he met once a week. Adhering to this regime, Euler quickly became an 
excellent mathematician and by 1725 he began seeking a position. Failing to find one 
in Switzerland, he moved to Russia in 1727 to join his friends Daniel and Niklaus 
(also Nicolaus) II Bernoulli at the newly founded Petersburg Academy. The Bernoulli 
brothers received their appointments when their father Johann declined a position 
and persuaded the Academy to employ his sons. Euler was originally appointed to 
a position in medicine, prompting him to brush up on his anatomy, but he ended up 
getting a situation in mathematics when Niklaus II died unexpectedly before Euler 
arrived in St. Petersburg. Euler enjoyed a very stimulating scientific collaboration with 
Daniel until the latter returned to Basel in 1734. Euler also developed a friendship with 
Christian Goldbach (1690-1764) from Prussia, whom Clifford Truesdell described as 
“an energetic and intelligent Prussian for whom mathematics was a hobby, the entire 
realm of letters an occupation, and espionage a livelihood.”³ Euler and Goldbach 
corresponded extensively with each other, and Goldbach sometimes suggested problems, 
stimulating Euler to important mathematical discoveries. Euler spent 1741-66 
in Berlin and then returned to St. Petersburg where he died, mathematically active 
until the end. 

Euler became interested in the interpolation problem when it appeared in a 1728 
paper presented by Goldbach to the St. Petersburg Academy. Goldbach also mentioned 
the problem in his letters to Daniel Bernoulli, who may have discussed the matter 
with Euler. Bernoulli outlined a solution in a postscript to a letter to Goldbach dated 
October 6, 1729.⁴ He let $A$ stand for an infinite number. Then the general $x$th term of 
the factorial sequence was given by 

$$\left(A + \frac{x}{2}\right)^{x-1}\left(\frac{2}{1+x}\cdot\frac{3}{2+x}\cdot\frac{4}{3+x}\cdots\frac{A}{A-1+x}\right).$$


3 Truesdell (1984) p. 345. 
4 Fuss (1968) vol. II, pp. 324-325. 
He noted that when $x = \frac{3}{2}$ and $A = 8$, the value of the preceding expression was 
approximately 1.3005. He had made a computational error here, and the value should 
have been 1.329, as he observed in a letter two weeks later. This value of $\left(\frac{3}{2}\right)!$ is correct 
to three decimal places. Even at this early stage of his career, Daniel Bernoulli did not 
pursue this problem any further, and it was left to Euler to initiate and develop the 
theory of the gamma function. In fact, Daniel was primarily a mathematical physicist 
and after middle age, his interest in pure mathematical questions waned. 
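Bernoulli’s corrected figure is easy to reproduce. The sketch below evaluates his expression $\left(A + \frac{x}{2}\right)^{x-1}\left(\frac{2}{1+x}\cdot\frac{3}{2+x}\cdots\frac{A}{A-1+x}\right)$ with a finite $A$ in place of his infinite number; the function name is hypothetical, and the comparison value $\Gamma\left(\frac52\right) = \left(\frac32\right)!$ comes from Python’s standard library rather than from anything Bernoulli had available.

```python
import math

def bernoulli_factorial(x, A):
    """Daniel Bernoulli's 1729 expression for x!, with a finite integer A."""
    value = (A + x / 2) ** (x - 1)
    for k in range(1, A):
        value *= (k + 1) / (k + x)  # factors 2/(1+x), 3/(2+x), ..., A/(A-1+x)
    return value

approx_small = bernoulli_factorial(1.5, 8)     # Bernoulli's own choice A = 8
approx_large = bernoulli_factorial(1.5, 1000)  # a much larger A, for comparison
exact = math.gamma(2.5)                        # (3/2)! = Gamma(5/2)
print(approx_small, approx_large, exact)
```

With $A = 8$ the result rounds to 1.329, in agreement with the corrected value quoted above; increasing $A$ brings the expression closer to $\Gamma\left(\frac52\right)$.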

Euler’s letter to Goldbach, dated October 15, 1729,⁵ gave the value of the 
interpolating function $\Gamma(m+1)$, using Legendre’s notation, as an infinite product 

$$\frac{1\cdot 2^m}{1+m}\cdot\frac{2^{1-m}\cdot 3^m}{2+m}\cdot\frac{3^{1-m}\cdot 4^m}{3+m}\cdot\frac{4^{1-m}\cdot 5^m}{4+m}\cdots. \tag{17.2}$$

He observed that the infinite product reduced to $m!$ when $m$ was a positive integer, 
though he verified this only for $m = 2$ and $m = 3$. He also noted in the letter that the 
infinite product (17.2), when terminated after $n$ terms, and after cancellation of terms 
$k^m$, $k^{-m}$, $k = 2, 3, \ldots, n$, could be written as 

$$\frac{1\cdot 2\cdot 3\cdots n\cdot (n+1)^m}{(1+m)(2+m)\cdots(n+m)}. \tag{17.3}$$

This implied that the product (17.2) was equal to 

$$\lim_{n\to\infty}\frac{1\cdot 2\cdots n\cdot (n+1)^m}{(1+m)(2+m)\cdots(n+m)}. \tag{17.4}$$

In his 1812 paper on hypergeometric functions,⁶ a paper published in 1813, Gauss 
denoted the function defined by the limit (17.4) as $\Pi(m)$. Thus 

$$\lim_{n\to\infty}\frac{1\cdot 2\cdots n\cdot (n+1)^m}{(1+m)(2+m)\cdots(n+m)} = \Pi(m) = \Gamma(m+1). \tag{17.5}$$

Euler recognized the importance of this interpolating function but he did not have a 
notation for it. Euler simply wrote $1\cdot 2\cdot 3\cdots x$, for some real positive number $x$. 
On some occasions he used the symbol $[x]$ for $\Gamma(x+1)$, but this was a temporary 
situational device; he used square brackets for other functions as well. 
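The behaviour of the truncated product $\frac{1\cdot 2\cdots n\cdot(n+1)^m}{(1+m)(2+m)\cdots(n+m)}$ of (17.3) can be observed numerically. The sketch below is only illustrative: it sums logarithms to avoid overflow, the function name is hypothetical, and `math.gamma` serves as the modern reference value for $\Gamma(m+1)$.

```python
import math

def euler_partial_product(m, n):
    """Truncated product (17.3): 1*2*...*n*(n+1)^m / ((1+m)(2+m)...(n+m))."""
    log_value = m * math.log(n + 1)
    for k in range(1, n + 1):
        log_value += math.log(k) - math.log(k + m)
    return math.exp(log_value)

n_terms = 100_000
results = {m: euler_partial_product(m, n_terms) for m in (2, 3, 0.5)}
for m, value in results.items():
    print(m, value, math.gamma(m + 1))
```

For $m = 2$ and $m = 3$ the values approach $2!$ and $3!$, the two cases Euler checked by hand; for $m = \frac12$ the value approaches $\Gamma\left(\frac32\right)$, though only at a rate of roughly $1/n$.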

Soon after he wrote his letter to Goldbach, Euler presented to the Petersburg 
Academy a long paper on the subject, although the Academy did not publish it until 
1738.⁷ In this paper, Euler wrote that after he found the product (17.2), he set $m = \frac{1}{2}$ 
to obtain an expression that, as he recalled from Wallis, had the value $\frac{\sqrt{\pi}}{2}$. 

Since Wallis had obtained this result while investigating the area under $y = \sqrt{1 - x^2}$ 
on the interval [0, 1], Euler was led to consider the integral $\int_0^1 x^e (1-x)^n\,dx$ 
and eventually arrived at the formula 


5 Fuss (1968) vol. I, pp. 3-7. 
6 Gauss (1813) or Gauss (1863-1927) vol. 3, pp. 123-162, especially pp. 144-152. 
7 Eu. I-14 pp. 1-24. E 19. 

$$\Gamma(m+1) = \int_0^1 (-\ln x)^m\,dx. \tag{17.6}$$


Euler proved only that the integral (17.6) was equal to the product (17.2) when 
$m$ was a positive integer. From this, he concluded that the integral would equal the 
product for all real $m > 0$. He then took $m = \frac{1}{2}$ in (17.6) so that, based on Wallis’s 
value, he got 

$$\int_0^1 (-\ln x)^{1/2}\,dx = \frac{\sqrt{\pi}}{2}. \tag{17.7}$$

Note that a change of variables $x = e^{-t^2}$, followed by integration by parts, gives 
the probability integral 

$$\int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}, \tag{17.8}$$

so that Euler had in fact found the value of the probability integral in the form (17.7). 
Recall from Sections 3.1 and 3.2 that, in modern notation, Wallis had guessed the 
value of the integral 

$$\int_0^1 \left(1 - x^{1/p}\right)^q dx = \frac{p!}{(q+1)(q+2)\cdots(q+p)} = \frac{q!}{(p+1)(p+2)\cdots(p+q)}. \tag{17.9}$$

When $p$ was a positive integer, Wallis could use the first equation in (17.9) to 
compute the integral; when $q$ was a positive integer, he could use the second equation. 
In a paper presented to the Petersburg Academy in 1739,⁸ Euler gave a proof of (17.9), 
extended to the case where both $p$ and $q$ were not necessarily integers. Interestingly, 
his proof was modeled on Wallis’s technique of defining two sequences and then 
combining them. In effect, Euler proved that 

$$\int_0^1 x^{p-1}(1-x)^{q-1}\,dx = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}, \tag{17.10}$$

though he stated the result in this form only in 1766.⁹ In his 1739 paper, Euler did 
not clearly specify when he was taking $p$ or $q$ to be a positive integer and when he 
was not. Interestingly, in 1772,¹⁰ Euler was very clear about this, proving the results 
for $p$ and $q$ integers. But then he invoked a form of “the principle of permanence 
of equivalent forms” and wrote that the results would hold for $p$ and $q$ real values. 
We note that in section 52 of this paper, Euler gave a result equivalent to Gauss’s 
multiplication formula (17.15). 


8 Eu. I-14 pp. 260-290. E 112. 
9 Eu. I-17 pp. 268-288. E 321. 
10 Eu. I-17 pp. 316-357. E 421. 
Sometime in the 1740s or perhaps earlier, Euler discovered a connection between 
the gamma function and the trigonometric functions, in the form of his reflection 
formula: 

$$\Gamma(x)\Gamma(1-x) = \frac{\pi}{\sin \pi x}. \tag{17.11}$$

Note that in the integral (17.10), $p$ and $q$ are positive. Therefore, if we take $p = x$ 
and $q = 1 - x$, so that $0 < x < 1$, then (17.10) changes to 

$$\int_0^1 t^{x-1}(1-t)^{-x}\,dt = \Gamma(x)\Gamma(1-x), \qquad 0 < x < 1. \tag{17.12}$$

Now, following Euler,¹¹ we set $t = \frac{y}{1+y}$; then (17.12) is transformed to 

$$\int_0^\infty \frac{y^{x-1}}{1+y}\,dy = \Gamma(x)\Gamma(1-x), \qquad 0 < x < 1. \tag{17.13}$$

Recall that Euler had proved that the integral in (17.13) was equal to $\frac{\pi}{\sin \pi x}$; see our 
equation (15.40). 
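The chain of equalities behind the reflection formula, $\int_0^\infty \frac{y^{x-1}}{1+y}\,dy = \Gamma(x)\Gamma(1-x) = \frac{\pi}{\sin \pi x}$ for $0 < x < 1$, can be tested numerically. The sketch below rests on assumptions that are not in the text: the substitution $y = e^u$ is used purely for numerical convenience (it turns the integrand into a smooth, exponentially decaying function of $u$), the infinite range is truncated to $[-100, 100]$, and the function name is hypothetical.

```python
import math

def mellin_integral(x, half_width=100.0, n=4000):
    """Trapezoid estimate of the integral of y^(x-1)/(1+y) over (0, inf).

    Uses y = e^u, so the integrand becomes e^(x*u) / (1 + e^u) on (-inf, inf),
    truncated here to [-half_width, half_width].
    """
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        u = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(x * u) / (1.0 + math.exp(u))
    return total * step

checks = []
for x in (0.25, 0.5, 0.75):
    integral = mellin_integral(x)
    gamma_product = math.gamma(x) * math.gamma(1.0 - x)
    sine_form = math.pi / math.sin(math.pi * x)
    checks.append((integral, gamma_product, sine_form))
    print(x, integral, gamma_product, sine_form)
```

The truncation is adequate only when $x$ is not too close to 0 or 1, since the tails decay like $e^{xu}$ on the left and $e^{-(1-x)u}$ on the right.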


Euler made a curious observation in his 1729 letter to Goldbach. He wrote that 
the value of (17.2) when $m = \frac{1}{2}$ was $\frac{1}{2}\sqrt{\sqrt{-1}\,\log(-1)}$, equal to the square root of 
the area of a circle with diameter 1. This amounted to $\sqrt{-1}\,\ln(-1) = \pi$. At that 
time, mathematicians did not have a clear idea about how the logarithm of a negative 
number should be defined. Leibniz and Johann Bernoulli had some correspondence 
on this point in the 1710s, but these discussions brought forth nothing of real value. 
Eventually, Euler produced a complete definition of the logarithm of a complex 
number, including its property of being multivalued. The question is, did Euler have 
a good understanding of this definition in 1729? Perhaps he did not. Roger Cotes’s 
posthumous work, Harmonia Mensurarum, published in 1722, had the formula 

$$\sqrt{-1}\,\log(\cos\theta + i\sin\theta) = \theta$$

and Euler’s formula covered the particular case when $\theta = \pi$. Moreover, Cotes’s result 
had an error in sign and this error reappeared in Euler, if we take the principal value of 
$\log(-1)$ to be $i\pi$. It seems reasonable to draw the conclusion that Euler got his result 
from Cotes. However, the formula of Cotes set Euler on the right track toward his own 
more conclusive results, finally written up in the 1740s. 

Euler did not deal with the question of the convergence of the infinite product 
(17.2). It was not the practice among mathematicians of the eighteenth century to go 
into the details of convergence problems. However, the manner of Euler’s expression 
in some cases leads us to believe that he had clear ideas about what was meant by 
convergence. For example, in (17.2) Euler did not cancel the factors, showing us that 
here he was not unaware of convergence issues. One may easily check that the $n$th 
term of the product is 

$$\frac{n^{1-m}(n+1)^m}{n+m} = \left(1 + \frac{1}{n}\right)^m \left(1 + \frac{m}{n}\right)^{-1} = 1 + \frac{m(m-1)}{2n^2} + O\left(\frac{1}{n^3}\right)$$

and thus that the infinite product converges. 


11 Eu. I-15 p. 558; E 575, § 38. 

The eighteenth-century mathematicians produced an enormous body of analytical 
results without a substantial discussion of convergence. The first mathematician to 
seriously think about convergence issues was Carl Friedrich Gauss (1777-1855). Like 
Euler, he had an extremely broad range of interests, covering almost every area of 
pure and applied mathematics. His paper on the gamma function was a part of a 
larger work on hypergeometric series published in 1813. He founded his study of 
convergence on the theory of limits of sequences. In an unpublished early work, he 
discussed concepts such as the upper and lower limits of sequences. It is difficult to 
determine the influences informing Gauss’s work. Of course, he was extremely well 
read and was very familiar with the works of his great predecessors. But he appeared to 
prefer to work in isolation. So it is not clear what motivated him to study convergence 
of infinite series and products, besides a desire for greater mathematical rigor. Thus, 
in the 1813 paper mentioned earlier, Gauss showed that the limit in (17.4) existed. He 
also gave a new method of deriving Euler’s results (17.7), (17.10), and (17.11). At the 
heart of Gauss’s new method was the summation formula 


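$$1 + \frac{a\cdot b}{1\cdot c} + \frac{a(a+1)\cdot b(b+1)}{1\cdot 2\cdot c(c+1)} + \frac{a(a+1)(a+2)\cdot b(b+1)(b+2)}{1\cdot 2\cdot 3\cdot c(c+1)(c+2)} + \cdots = \frac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)}, \tag{17.14}$$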


where a, b, c were complex numbers with Re(c — a — b) > 0. He gave a completely 
satisfactory proof of this formula, given in Chapter 23. 
Gauss also found the multiplication formula for the gamma function: 

$$n^{nz - \frac{1}{2}}\,\Gamma(z)\,\Gamma\left(z + \frac{1}{n}\right)\Gamma\left(z + \frac{2}{n}\right)\cdots\Gamma\left(z + \frac{n-1}{n}\right) = (2\pi)^{\frac{n-1}{2}}\,\Gamma(nz), \tag{17.15}$$

where $n$ was a positive integer. The reflection formula (17.11) suggests that the 
inspiration for this must have been the similar formula for $\sin n\pi z$ discovered and 
published by Euler in his 1748 Introductio in Analysin Infinitorum. In slightly modified 
form, Euler’s formula was 

$$\sin n\pi z = 2^{n-1} \sin \pi z\, \sin \pi\left(z + \frac{1}{n}\right) \sin \pi\left(z + \frac{2}{n}\right) \cdots \sin \pi\left(z + \frac{n-1}{n}\right). \tag{17.16}$$
Euler also gave a special case of (17.15):¹² 

$$\sqrt{n}\,\Gamma\left(\frac{1}{n}\right)\Gamma\left(\frac{2}{n}\right)\cdots\Gamma\left(\frac{n-1}{n}\right) = (2\pi)^{\frac{n-1}{2}}. \tag{17.17}$$


Slightly before Gauss’s paper was published, Legendre discovered the duplication 
formula,¹³ the $n = 2$ case of Gauss’s formula (17.15). Recall that Euler had a formula 
equivalent to (17.15) in a paper of 1772.¹⁴ Legendre’s proof employed the integral 
representation of the gamma function, and this in turn suggested the problem of 
deriving the properties of the gamma function using definite integrals. At that time, 
definite integrals were appearing in many areas of mathematics and its applications. 
For Euler, this topic was a life-long interest; he had already evaluated several definite 
integrals by means of a variety of techniques. P. S. Laplace and Legendre also pursued 
the study of definite integrals, of great usefulness in solving problems in probability 
theory and mechanics. The method of Fourier transforms, originated by Fourier in 
his work on heat conduction and its applications to wave phenomena, also produced 
numerous definite integrals. 

By 1810, several French mathematicians had published papers whose aim was 
to evaluate classes of definite integrals. In 1814, Cauchy wrote a long memoir on 
definite integrals, published in 1827, the first of his many contributions to what would 
become complex function theory. A decade later, Cauchy gave a precise definition of a 
definite integral in his lectures at the École Polytechnique; he then proceeded to define 
improper integrals and their convergence. 

Dirichlet, though a Prussian, studied in Paris in the mid-1820s. He mastered 
Cauchy’s ideas on rigor and applied them to the series introduced into mathematics 
and mathematical physics by his friend Fourier. Even in his first paper on Fourier 
series,¹⁵ Dirichlet recognized the importance of extending the definite integral to 
include highly discontinuous functions. He even made use of improper integrals in his 
number theoretic work. He employed the integrals $\int_{-\infty}^{\infty} \cos x^2\,dx$ and $\int_{-\infty}^{\infty} \sin x^2\,dx$, 
closely related to the gamma function, to obtain a remarkable evaluation of the 
quadratic Gauss sum. We discuss this in Section 19.6. In his famous work on primes 
in arithmetic progressions, Dirichlet used Euler’s integral formula for the gamma 
function in the form 

$$\int_0^1 x^{n-1}\left(\ln\frac{1}{x}\right)^{s-1} dx = \frac{\Gamma(s)}{n^s} \tag{17.18}$$

to represent certain Dirichlet series as integrals. We discuss this in detail in Chapter 
28. Dirichlet’s number theoretic work motivated him to further investigate the gamma 
function within the theory of definite integrals. He wrote several papers on the topic, 


12 Eu. I-19 pp. 439-490, especially § 46. E 816. 
13 Legendre (1811-1817) vol. I, p. 284. 
14 Eu. I-17 pp. 316-357. E 421 section 52. 
15 Dirichlet (1829b). 
including one dealing with a multidimensional generalization of Euler’s beta integral 
(17.10).¹⁶ In a paper of 1829,¹⁷ he also studied the gamma integral with a complex 
parameter. 

Dirichlet’s interest in definite integrals, expressed through his publications and 
his lectures at Berlin University, created an interest in this topic among German 
mathematicians. Thus, in 1852, Richard Dedekind, who did great work in number 
theory, wrote his Göttingen doctoral thesis on Eulerian integrals. Riemann, greatly 
influenced by Dirichlet, made brilliant use of definite integrals, and the gamma integral 
in particular, in his great 1859 paper on the distribution of primes. In this paper, he 
expressed the zeta function as a contour integral from which he derived the functional 
equation for the zeta function. This work of Riemann inspired his student Hermann 
Hankel (1839-1873) to find, in 1864, a contour integral representation for $\Gamma(z)$, valid 
for all complex $z$ except the negative integers.¹⁸ 

The gamma function also played a significant role in the development of the theory 
of infinite products. In an 1848 paper,¹⁹ the English mathematician F. W. Newman 
explained how an exponential factor $e^{-x/n}$ in $\left(1 + \frac{x}{n}\right)e^{-x/n}$ ensured the convergence of 
the product $\prod_{n=1}^{\infty}\left(1 + \frac{x}{n}\right)e^{-x/n}$. Using this product, he obtained a new representation 
for the gamma function. Oscar Schlömilch (1823-1901), a student of Dirichlet, 
published this result in 1843,²⁰ taking an integral of Dirichlet as a starting point of the 
proof. Schlömilch’s work was based on the evaluation of definite integrals. In 1856, 
Weierstrass gave a foundation to the theory of the gamma function by defining it in 
terms of an infinite product.²¹ In fact, the ideas of this paper inspired him to construct 
entire functions with a prescribed sequence of zeros. 

The gamma function is one of the basic special functions, cropping up again and 
again. Consequently, mathematicians have tried to derive its properties from several 
different points of view. In 1930,²² Emil Artin observed that the concept of logarithmic 
convexity, used by Bohr and Mollerup to prove the equivalence of the product and 
integral representations of the gamma function, could be employed to characterize 
and develop the properties of this function. While Artin worked with real variables, 
in 1939 Helmut Wielandt gave a complex analytic characterization. The defining 
property other than the obvious $f(z+1) = z f(z)$ was that $f(z)$ was bounded in the 
vertical strip $1 \le \operatorname{Re} z < 2$. Wielandt, a group theorist, did not publish his theorem. 
Instead, he showed it to Konrad Knopp, who included it in the fifth edition of his 
Funktionentheorie II.²³ 


16 Dirichlet (1839a) and (1839b). 
17 Dirichlet (1829a). 
18 Hankel (1864). 
19 Newman (1848). 
20 Schlömilch (1843). 
21 Weierstrass (1856). 
22 Artin (1964). 
23 Knopp (1941) pp. 47-49. 
Figure 17.1 Stirling: gamma values by interpolation. (Ordinates labeled, left to right: ${}_9A$, ${}_7A$, ${}_5A$, ${}_3A$, ${}_1A$, $O$, $A_1$, $A_3$, $T$, $A_5$, $A_7$, $A_9$.) 


17.2 Stirling: $\Gamma\left(\frac{1}{2}\right)$ by Newton–Bessel Interpolation 


James Stirling gave a remarkable numerical evaluation of $\Gamma\left(\frac{1}{2}\right)$. He tabulated the 
base 10 logarithms of the twelve numbers 5!, 6!, ..., 16! and then applied the 
Newton–Bessel interpolation formula to obtain the middle value $\left(\frac{21}{2}\right)!$. Then by 
successive division, he computed $\Gamma\left(\frac{1}{2}\right) = \left(-\frac{1}{2}\right)!$ to ten decimal places and recognized 
it to be $\sqrt{\pi}$. See Section 9.2 for a discussion of interpolation formulas. In the Methodus 
Differentialis,²⁴ proposition 20, Stirling described the interpolation formula: 

He first supposed an even number of equidistant ordinates and, using a diagram 
similar to Figure 17.1, he denoted them ${}_9A, {}_7A, {}_5A, \ldots, A_5, A_7, A_9$. Note that these 
values refer to heights of the line segments. He called ${}_1A$ and $A_1$ the middle values 
and set $A$ to be their sum. Thus, he had nine differences of these ten numbers, such 
as ${}_7A - {}_9A$, ${}_5A - {}_7A, \ldots, A_9 - A_7$. He called the middle difference $a$. Taking the 
eight differences of these nine differences, he denoted the sum of the middle two terms 
as $B$. Next, Stirling called the middle term of the seven differences of the eight second 
differences $b$, and so on. He took $O$ to be the midpoint of ${}_1A$ and $A_1$ and let $T$ be an 
arbitrary ordinate; he let $\frac{z}{2}$ be the ratio of the distance between $O$ and that point whose 
ordinate was $T$, and the distance between ${}_1A$ and $A_1$. Note that this last distance is 
also the distance between any two successive ordinates. Stirling wrote the formula as 

$$\begin{aligned}
T = {} & \frac{A + az}{2} + \frac{3B + bz}{2} \times \frac{z^2 - 1}{4\cdot 6} + \frac{5C + cz}{2} \times \frac{z^2 - 1}{4\cdot 6} \times \frac{z^2 - 9}{8\cdot 10} \\
& + \frac{7D + dz}{2} \times \frac{z^2 - 1}{4\cdot 6} \times \frac{z^2 - 9}{8\cdot 10} \times \frac{z^2 - 25}{12\cdot 14} \\
& + \frac{9E + ez}{2} \times \frac{z^2 - 1}{4\cdot 6} \times \frac{z^2 - 9}{8\cdot 10} \times \frac{z^2 - 25}{12\cdot 14} \times \frac{z^2 - 49}{16\cdot 18} + \cdots.
\end{aligned} \tag{17.19}$$

Stirling also noted that $z$ was positive when $T$ was on the right side of the middle 
point $O$, as in Figure 17.1, and negative when it lay on the left-hand side. To write this 
in modern form, set 


24 Stirling and Tweddle (2003). 
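$${}_1A = f\left(s - \frac{h}{2}\right), \quad A_1 = f\left(s + \frac{h}{2}\right), \quad {}_3A = f\left(s - \frac{3h}{2}\right), \quad A_3 = f\left(s + \frac{3h}{2}\right), \quad \text{etc.}$$

Then the Newton–Bessel formula, discussed in Section 9.2, is given by 

$$\begin{aligned}
f\left(s - \frac{h}{2} + xh\right) = {} & \frac{1}{2}\left(f\left(s - \frac{h}{2}\right) + f\left(s + \frac{h}{2}\right)\right) + \left(x - \frac{1}{2}\right)\Delta f\left(s - \frac{h}{2}\right) \\
& + \frac{x(x-1)}{2!}\cdot\frac{1}{2}\left(\Delta^2 f\left(s - \frac{3h}{2}\right) + \Delta^2 f\left(s - \frac{h}{2}\right)\right) \\
& + \frac{x\left(x - \frac{1}{2}\right)(x-1)}{3!}\,\Delta^3 f\left(s - \frac{3h}{2}\right) + \cdots.
\end{aligned} \tag{17.20}$$

Stirling’s expression is obtained by setting $z = 2x - 1$ and combining pairs of terms 
in (17.20). 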
In Example 2 of Proposition 21, he explained his approach to the evaluation of $\Gamma(x)$ 
for $x > 0$: 
Let the series [sequence] to be interpolated be 1,1,2,6,24,120,720, etc. whose terms are 
generated by repeated multiplication of the numbers 1,2,3,4,5, etc. Since these terms increase 
very rapidly, their differences will form a divergent progression, as a result of which the ordinate 
of the parabola does not approach the true value. Therefore, in this and similar cases I interpolate 
the logarithms of the terms, whose differences can in fact form a rapidly convergent series, even 


if the terms themselves increase very rapidly as in the present example. 


Stirling interpolated the sequence 

$$\ln 0!,\ \ln 1!,\ \ln 2!,\ \ldots,\ \ln n!,\ \ln (n+1)!,\ \ldots.$$

The first difference would be 

$$\ln (n+1)! - \ln n! = \ln(n+1);$$

the second difference would be written as 

$$\ln(n+1) - \ln n = \ln\left(1 + \frac{1}{n}\right) \approx \frac{1}{n};$$

and the third difference would be written as 

$$\ln(n+1) - 2\ln(n) + \ln(n-1) = \ln\left(1 - \frac{1}{n^2}\right) \approx -\frac{1}{n^2}.$$

Thus, we can see that the successive differences get small rapidly when $n$ is large 
enough, though $\ln n!$ increases. This means that if one desires to find the value of 
$\Gamma(n+1)$ for a small value, $n = \frac{1}{2}$, then one must first compute $\ln \Gamma(n+1)$ for a 
larger value of $n$ by the method of differences and then apply the functional equation 
$\Gamma(n+1) = n\Gamma(n)$. Stirling thought that the interpolating function must satisfy the 
same functional relation as the successive terms of the original sequence. In fact, in 
Proposition 16 of his book, he attempted to explain why this must be so. Stirling then 
continued: 


Now I propose to find the term which stands in the middle between the first two 1 and 1. And 
since the logarithms of the initial terms have slowly convergent differences, I first seek the term 
standing in the middle between two terms which are sufficiently far removed from the beginning, 
for example, that between the eleventh term 3628800 and the twelfth term 39916800: and when 
this is given, I may go back to the term sought by means of Proposition 16. And since there are 
some terms located on both sides of the intermediate term which is to be determined first, I set up 
the operation by means of the second case of Proposition 20. 


Actually, Stirling worked with $\log_{10}$, since the logarithmic tables were often in 
base 10. He therefore took the twelve known ordinates to be $\log_{10}\Gamma\left(\frac{z + 23}{2}\right)$ for 
$z = \pm 1, \pm 3, \pm 5, \pm 7, \pm 9$, and $\pm 11$ and used Newton–Bessel to find the value 
at $z = 0$. From this he found $\Gamma(11.5)$ and after successive division by 10.5, 9.5, ..., 
down to 1.5, he computed $\Gamma\left(\frac{3}{2}\right)$ to ten decimal places. After this amazing calculation 
Stirling commented, “From this is established that the term between 1 and 1 [referring 
to 0! and 1!] is .8862269251, whose square is .7853... etc., namely, the area of a circle 
whose diameter is one. And twice that, 1.7724538502, namely the term which stands 
before the first principal term by half the common interval, is equal to the square root 
of the number 3.1415926... etc., which denotes the circumference of a circle whose 
diameter is one.” 

Here Stirling gave the value of $\Gamma\left(\frac{1}{2}\right)$, obtained from $\Gamma\left(\frac{3}{2}\right) = \frac{1}{2}\Gamma\left(\frac{1}{2}\right)$. His value 
for $\Gamma\left(\frac{3}{2}\right)$ is incorrect only in the tenth decimal place; when rounded, the tenth place 
should be 5 instead of 1. 
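Stirling’s procedure can be imitated in a few lines. The sketch below is not his hand computation: it assumes that using all available differences in the Newton–Bessel formula amounts to evaluating the unique degree-eleven polynomial through the twelve tabulated points, and it evaluates that polynomial by plain Lagrange interpolation instead. The names are hypothetical, and floating-point logarithms stand in for Stirling’s tables.

```python
import math

# Twelve known ordinates: log10 of 5!, 6!, ..., 16!, i.e. log10 Gamma(t) for t = 6, ..., 17.
nodes = list(range(6, 18))
values = [math.log10(math.factorial(t - 1)) for t in nodes]

def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the unique polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += weight * yi
    return total

# Middle value: an estimate of log10 Gamma(11.5).
log_gamma_mid = lagrange_interpolate(nodes, values, 11.5)
gamma_three_halves = 10.0 ** log_gamma_mid

# Step down with Gamma(t + 1) = t * Gamma(t): divide by 10.5, 9.5, ..., 1.5.
for divisor in (10.5, 9.5, 8.5, 7.5, 6.5, 5.5, 4.5, 3.5, 2.5, 1.5):
    gamma_three_halves /= divisor

print(gamma_three_halves, math.sqrt(math.pi) / 2)
```

Interpolating the logarithms far from the start of the sequence, where the differences are small, is what makes the result accurate; the estimate should agree with $\frac{\sqrt{\pi}}{2}$ to about eight decimal places, in line with the accuracy Stirling reported.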


17.3 Euler’s Integral for the Gamma Function 


In his 1730 paper “De Progressionibus Transcendentibus,” Euler wrote²⁵ that he found 
the infinite product 

$$\frac{1\cdot 2^m}{1+m}\cdot\frac{2^{1-m}\cdot 3^m}{2+m}\cdot\frac{3^{1-m}\cdot 4^m}{3+m}\cdot\frac{4^{1-m}\cdot 5^m}{4+m}\cdots \tag{17.21}$$

as an expression that reduced to $m!$ when $m$ was a positive integer. Note that the 
product of a finite number of factors of the numerators, $2^m\cdot 2^{1-m}\cdot 3^m\cdot 3^{1-m}\cdots n^m\cdot n^{1-m}$, 
is $n!$. If $n > m$ then 

$$n! = m!\,(1+m)(2+m)\cdots(n-m+m);$$

so that the product (17.21) reduces to $m!$ because of the cancellation of the denominator 
terms with the terms in the numerator following $m!$. 


25 Eu. I-14 pp. 1-24. E 19. 
Euler then observed that for $m = \frac{1}{2}$ (17.21) reduced to 

$$\sqrt{\frac{2\cdot 4}{3\cdot 3}\cdot\frac{4\cdot 6}{5\cdot 5}\cdot\frac{6\cdot 8}{7\cdot 7}\cdots}. \tag{17.22}$$

Euler remarked that when he obtained this result, he recalled having seen in Wallis’s 
work that its value was $\frac{\sqrt{\pi}}{2}$. Wallis was evaluating, in modern notation, $\int_0^1 (1 - x^2)^{1/2}\,dx$. 

Euler was thus inspired to consider integrals of the form 

$$\int_0^1 x^e (1-x)^n\,dx.$$

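Next, by the binomial theorem, from

$$x^e(1-x)^n = x^e - nx^{e+1} + \frac{n(n-1)}{1\cdot 2}x^{e+2} - \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}x^{e+3} + \cdots,$$

term-by-term integration gave him

$$\int_0^1 x^e(1-x)^n\,dx = \frac{1}{e+1} - \frac{n}{1\cdot(e+2)} + \frac{n(n-1)}{1\cdot 2(e+3)} - \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3(e+4)} + \cdots. \tag{17.23}$$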


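Euler evaluated the series in (17.23) for $n = 0, 1, 2, 3$ to obtain

$$\frac{1}{e+1},\quad \frac{1}{(e+1)(e+2)},\quad \frac{1\cdot 2}{(e+1)(e+2)(e+3)},\quad \frac{1\cdot 2\cdot 3}{(e+1)(e+2)(e+3)(e+4)}.$$

He concluded that

$$(e+n+1)\int_0^1 x^e(1-x)^n\,dx = \frac{1\cdot 2\cdots n}{(e+1)(e+2)\cdots(e+n)}. \tag{17.24}$$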
For a modern mathematician, (17.24) would require an inductive proof, but during 
the eighteenth century, mathematicians found this reasoning sufficient. 
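The pattern Euler read off from the first four cases can be checked exactly for many more values. The sketch below is not Euler's argument; it simply compares the finite alternating sum obtained by term-by-term integration with the closed form $n!/\big((e+1)(e+2)\cdots(e+n+1)\big)$ implied by (17.24), using exact rational arithmetic. The function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def series_17_23(e, n):
    """Finite alternating sum from term-by-term integration of x^e (1 - x)^n."""
    return sum(Fraction((-1) ** k * comb(n, k), e + k + 1) for k in range(n + 1))

def closed_form(e, n):
    """n! / ((e+1)(e+2)...(e+n+1)), the value implied by (17.24)."""
    denom = 1
    for j in range(1, n + 2):
        denom *= e + j
    return Fraction(factorial(n), denom)

# Compare the two expressions for small integer e and n, and for one rational e.
for e in list(range(0, 5)) + [Fraction(1, 2)]:
    for n in range(0, 7):
        assert series_17_23(e, n) == closed_form(e, n)
print(series_17_23(2, 3), closed_form(2, 3))
```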


Wishing to make the denominator of (17.24) equal to 1, Euler set $e = \frac fg$ and rewrote
(17.24) as

$$\frac{f+(n+1)g}{g^{n+1}}\int_0^1 x^{\frac fg}(1-x)^n\,dx = \frac{1\cdot 2\cdots n}{(f+g)(f+2g)\cdots(f+ng)}. \tag{17.25}$$

He next effected a change of variables, replacing $x$ by $x^{\frac{g}{f+g}}$, to find that

$$\frac{f+(n+1)g}{f+g}\int_0^1\frac{\left(1-x^{\frac{g}{f+g}}\right)^n}{g^n}\,dx = \frac{1\cdot 2\cdots n}{(f+g)(f+2g)\cdots(f+ng)}. \tag{17.26}$$
He could then obtain $n!$ as an integral by taking $f = 1$ and $g = 0$ in the integral in
(17.26); in that case, he observed, the integrand in (17.26) became

$$\frac{(1-x^0)^n}{0^n}.$$

In modern terms, Euler meant that the integrand was

$$\lim_{g\to 0}\left(\frac{1-x^g}{g}\right)^n.$$

Euler stated that the limit could be evaluated by a well-known rule; he was referring
to l’Hôpital’s rule. Thus, in our modern notation, since

$$-\ln x = \lim_{g\to 0}\frac{-x^g\ln x}{1} = \lim_{g\to 0}\frac{1-x^g}{g},$$

he got

$$\int_0^1(-\ln x)^n\,dx = 1\cdot 2\cdot 3\cdots n. \tag{17.27}$$


Euler assumed that because the product (17.21) and the integral in (17.27) both
interpolated the sequence of factorials, they must be equal. With this assumption, he
could write (17.7):

$$\int_0^1(-\ln x)^{\frac12}\,dx = \frac{\sqrt{\pi}}{2}.$$


At the end of the paper, he explained how the gamma function could be used to
define fractional derivatives. This was a problem already raised by Leibniz, as Euler
may have been aware. He observed that when $n$ was a nonnegative integer,

$$\frac{d^n z^e}{dz^n} = e(e-1)\cdots(e-n+1)\,z^{e-n};$$

and so one could define, for any positive real number $n$,

$$\frac{d^n z^e}{dz^n} = z^{e-n}\,\frac{\int_0^1(-\ln x)^e\,dx}{\int_0^1(-\ln x)^{e-n}\,dx}. \tag{17.28}$$

By taking $e = 1$ and $n = \frac12$ he had

$$\frac{d^{\frac12}z}{dz^{\frac12}} = \frac{2\sqrt z}{\sqrt\pi}. \tag{17.29}$$
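Euler's definition can be tried out on powers of $z$. The sketch below assumes the rule in the form $d^n z^e/dz^n = \Gamma(e+1)/\Gamma(e-n+1)\,z^{e-n}$, which is what the quotient of integrals amounts to once each integral $\int_0^1(-\ln x)^s\,dx$ is read as $\Gamma(s+1)$. The helper `frac_deriv_power` is a hypothetical name, and it handles only a single power with $e - n + 1 > 0$.

```python
import math

def frac_deriv_power(e, n):
    """Return (coefficient, exponent) of the order-n derivative of z**e,
    using the rule d^n z^e / dz^n = Gamma(e+1) / Gamma(e-n+1) * z^(e-n)."""
    return math.gamma(e + 1) / math.gamma(e - n + 1), e - n

c1, p1 = frac_deriv_power(1.0, 0.5)   # half-derivative of z
c2, p2 = frac_deriv_power(p1, 0.5)    # half-derivative of z**0.5
print(c1, p1)                         # coefficient 2/sqrt(pi), exponent 0.5
print(c1 * c2, p2)                    # applying it twice gives the ordinary derivative of z
```

Applying the half-derivative twice to $z$ returns the constant 1, as an ordinary first derivative would, which is a useful consistency check on the definition.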


Euler did not do any more with this concept, later rediscovered by Abel, who in
1823 applied it to express more succinctly the solution of an integral equation.²⁶


26 See Smith (1959) vol. 2, pp. 656-662 for an English translation of Abel’s paper. 
In 1832, Liouville showed²⁷ that the integral equation considered by Abel could be
easily solved by means of the fractional derivative.

Several proofs of (17.11) follow immediately from formulas appearing in Euler’s
papers. He did not explicitly state this result very often, perhaps because he had not
developed a convenient notation for $\Gamma(x)$. In his simple proof published in 1772,²⁸
Euler used the products for $\Gamma(x)$ and $\sin \pi x$: He let

$$[m] = \frac{1\cdot 2^m}{1+m}\cdot\frac{2^{1-m}\cdot 3^m}{2+m}\cdot\frac{3^{1-m}\cdot 4^m}{3+m}\cdot\text{etc.}$$

Then

$$[-m] = \frac{1\cdot 2^{-m}}{1-m}\cdot\frac{2^{1+m}\cdot 3^{-m}}{2-m}\cdot\frac{3^{1+m}\cdot 4^{-m}}{3-m}\cdot\text{etc.},$$

$$[m][-m] = \frac{1^2}{1-m^2}\cdot\frac{2^2}{2^2-m^2}\cdot\frac{3^2}{3^2-m^2}\cdot\text{etc.} = \frac{\pi m}{\sin \pi m}. \tag{17.30}$$

Note that (17.30) followed from the infinite product for $\sin x$. Here Euler used the
symbol $[m]$ for $\Gamma(m+1)$, but its use was merely provisional. Observe that equation
(17.30) is identical with

$$\Gamma(1+m)\Gamma(1-m) = \frac{\pi m}{\sin \pi m},$$

an equation that we can see is equivalent to (17.11). The result (17.30) was first stated
by Euler without proof in a 1749 presentation to the Berlin Academy of a paper
eventually published in 1768.²⁹ Euler finally published a proof in his 1772 paper. This
result offers an example of the complexity involved in tracing and dating mathematical 
results, especially those to which Euler contributed. 
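The identity $\Gamma(1+m)\Gamma(1-m) = \pi m/\sin \pi m$ and the product behind it are easy to test numerically. The following sketch assumes the product in (17.30) has $k$-th factor $k^2/(k^2-m^2)$; the value $m = 0.3$ and the number of factors are arbitrary illustrative choices.

```python
import math

def reflection_partial(m, terms):
    """Partial product of (17.30): prod_{k=1}^{terms} k^2 / (k^2 - m^2)."""
    p = 1.0
    for k in range(1, terms + 1):
        p *= k * k / (k * k - m * m)
    return p

m = 0.3
lhs = math.gamma(1 + m) * math.gamma(1 - m)
rhs = math.pi * m / math.sin(math.pi * m)
print(lhs, rhs, reflection_partial(m, 100000))
```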


17.4 Euler’s Evaluation of the Beta Integral 


Euler did not prove (17.10) in his first paper on the gamma function, but the
second paper, “De Productis ex Infinitis Factoribus Ortis,” presented to the Petersburg
Academy in 1739,³⁰ contained an argument upon which a proof can be worked out.
He started with the observation that

$$\int_0^1 x^{m-1}(1-x^{nq})^{\frac pq}\,dx = \frac{m+(p+q)n}{m}\int_0^1 x^{m+nq-1}(1-x^{nq})^{\frac pq}\,dx. \tag{17.31}$$

Wallis had stated a similar functional relation without proof, but with the development
of calculus, this relation could easily be proved by using integration by parts.
In his 1739 paper, Euler wrote that the result (17.31) was easy to prove, and when
he returned to the subject over three decades later in his integral calculus book,


27 Liouville (1832). 

28 Eu. I-17 pp. 316-357, especially § 43. E 421. 

29 Eu. I-15 pp. 70-90, especially § 13. E 352. 

30 Eu. I-14 pp. 260-290, especially pp. 282-284. E 122 § 38-45.
he provided the details,³¹ revealing his explanation of the technique of integration
by parts: He supposed

$$\int x^{f-1}(1-x^g)^m\,dx = A\int x^{f-1}(1-x^g)^{m-1}\,dx + Bx^f(1-x^g)^m. \tag{17.32}$$

Euler then took the derivative to get

$$x^{f-1}(1-x^g)^m = Ax^{f-1}(1-x^g)^{m-1} - Bmg\,x^{f+g-1}(1-x^g)^{m-1} + Bf\,x^{f-1}(1-x^g)^m,$$

or

$$1-x^g = A - Bmg\,x^g + Bf(1-x^g) = A - Bmg + B(f+gm)(1-x^g).$$

Thus,

$$A - Bmg = 0 \quad\text{and}\quad B(f+mg) = 1$$

or

$$A = \frac{mg}{f+mg} \quad\text{and}\quad B = \frac{1}{f+mg}. \tag{17.33}$$

Next, equation (17.31) followed from (17.32) by choosing the parameters appropriately.
Euler applied the functional relation (17.32) infinitely often to arrive at

$$\int_0^1 x^{m-1}(1-x^{nq})^{\frac pq}\,dx = \frac{(m+(p+q)n)(m+(p+2q)n)\cdots(m+(p+\infty q)n)}{m(m+nq)(m+2nq)\cdots(m+\infty nq)} \times \int_0^1 x^{m+\infty nq-1}(1-x^{nq})^{\frac pq}\,dx; \tag{17.34}$$

we note that the infinite product diverges and the integral on the right-hand side
vanishes. We can, however, define the right-hand side as a limit. Let us continue to
go along with Euler, who followed Wallis again by taking another integral similar to
the one on the left-hand side of (17.34) but which could be exactly evaluated. He took
$m = nq$ to obtain

$$\frac{1}{(p+q)n} = \int_0^1 x^{nq-1}(1-x^{nq})^{\frac pq}\,dx = \frac{(nq+(p+q)n)(nq+(p+2q)n)\cdots(nq+(p+\infty q)n)}{nq\,(2nq)(3nq)\cdots(\infty nq)} \times \int_0^1 x^{nq+\infty nq-1}(1-x^{nq})^{\frac pq}\,dx. \tag{17.35}$$


31 Eu. I-11 chapter 2. E 342.
Euler then observed that if $k$ was an infinite number and $\alpha$ finite, then

$$\int_0^1 x^k(1-x^{nq})^{\frac pq}\,dx = \int_0^1 x^{k+\alpha}(1-x^{nq})^{\frac pq}\,dx. \tag{17.36}$$

So, dividing equation (17.34) by equation (17.35), the integrals on the right-hand
side cancelled, and the result was

$$\int_0^1 x^{m-1}(1-x^{nq})^{\frac pq}\,dx = \frac{1}{(p+q)n}\cdot\frac{nq(m+(p+q)n)\cdot 2nq(m+(p+2q)n)\cdot 3nq(m+(p+3q)n)\cdots}{m(p+2q)n\,(m+nq)(p+3q)n\,(m+2nq)\cdots}.$$

Replacing $n$ by $\frac nq$, the form of the relation became

$$\int_0^1 x^{m-1}(1-x^n)^{\frac pq}\,dx = \frac{q}{(p+q)n}\cdot\frac{1(mq+(p+q)n)}{m(p+2q)}\cdot\frac{2(mq+(p+2q)n)}{(m+n)(p+3q)}\cdot\frac{3(mq+(p+3q)n)}{(m+2n)(p+4q)}\cdots. \tag{17.37}$$

This was Euler’s final result, from which we can derive (17.10), although we have
the benefit of hindsight. Observe that if we set $n = q = 1$ and then replace $p$ by $n - 1$,
we get

$$\int_0^1 x^{m-1}(1-x)^{n-1}\,dx = \frac{1\cdot(m+n)\cdot 2\cdot(m+n+1)\cdot 3\cdot(m+n+2)\cdots}{n\cdot m\cdot(n+1)\cdot(m+1)\cdot(n+2)\cdot(m+2)\cdots}. \tag{17.38}$$
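The product for the beta integral can be compared numerically with $\Gamma(m)\Gamma(n)/\Gamma(m+n)$. One caution: the order in which the factors are grouped matters for convergence. The sketch below assumes the grouping inherited from (17.37), namely the factor $1/n$ first and then the $k$-th factor $k(m+n+k-1)/\big((m+k-1)(n+k)\big)$; the function names and sample values are illustrative.

```python
import math

def beta_product(m, n, terms):
    """Partial product (1/n) * prod_{k=1}^{terms} k (m+n+k-1) / ((m+k-1)(n+k))."""
    p = 1.0 / n
    for k in range(1, terms + 1):
        p *= k * (m + n + k - 1) / ((m + k - 1) * (n + k))
    return p

def beta(m, n):
    """Beta integral expressed through gamma functions."""
    return math.gamma(m) * math.gamma(n) / math.gamma(m + n)

print(beta_product(2.5, 1.5, 200000), beta(2.5, 1.5))
```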


Thus, in his 1739 paper, Euler did not prove (17.10), but merely showed that the
beta integral could be written as a quotient of infinite products. In a paper of 1766,³²
he derived some properties of the beta integral. He started with the formula obtained
from (17.37) and replaced $m$ by $p$, $p$ by $q - n$, and $q$ by $n$:

$$\int_0^1 x^{p-1}(1-x^n)^{\frac{q-n}{n}}\,dx = \frac{p+q}{pq}\cdot\frac{n(p+q+n)}{(p+n)(q+n)}\cdot\frac{2n(p+q+2n)}{(p+2n)(q+2n)}\cdot\frac{3n(p+q+3n)}{(p+3n)(q+3n)}\cdots. \tag{17.39}$$

Euler denoted the integral in (17.39) by $\left(\frac pq\right)$; although its value clearly depended
also on $n$, Euler’s notation did not take account of this. Note that when $n = 1$, this
integral is the beta integral $B(p,q)$, in Legendre’s notation. Euler’s first observation
was that the right-hand side of (17.39) showed that the integral was symmetric in $p$
and $q$, that is


32 Eu. I-17 pp. 268-288. E 321.
$$\left(\frac pq\right) = \left(\frac qp\right).$$

Taking $q = n - p$ in (17.39) gives

$$\left(\frac{p}{n-p}\right) = \frac{n}{p(n-p)}\cdot\frac{n\cdot 2n}{(n+p)(2n-p)}\cdot\frac{2n\cdot 3n}{(2n+p)(3n-p)}\cdot\frac{3n\cdot 4n}{(3n+p)(4n-p)}\cdot\text{etc.} = \frac1p\cdot\frac{nn}{nn-pp}\cdot\frac{4nn}{4nn-pp}\cdot\frac{9nn}{9nn-pp}\cdots;$$

comparing with the product formula for $\sin x$, (15.20), we obtain with Euler

$$\left(\frac{p}{n-p}\right) = \left(\frac{n-p}{p}\right) = \frac{\pi}{n\sin\frac{p\pi}{n}},$$

a result Euler also discussed in an earlier paper on the beta integral.³³
After this, Euler devoted the remaining portions of his paper to working out some
particular cases, when $n = 1, 2, \ldots, 9$.
Euler did not explicitly evaluate the beta integral in terms of the gamma integral
until 1772, when he wrote a paper³⁴ dealing with the integral

$$\int_0^1 x^{f-1}(\ln x)^{\frac mn}\,dx.$$


An unusual feature of this paper was that Euler explicitly specified when he was 
using a variable that took integer values and when he was using a variable taking 
arbitrary real values; he did not do this in his earlier papers on the gamma function. 
He began the paper with the theorem that if n denoted a positive integer, then 


$$\int_0^1 x^{f-1}(1-x^g)^n\,dx = \frac{g^n}{f}\cdot\frac{1\cdot 2\cdots n}{(f+g)(f+2g)\cdots(f+ng)}. \tag{17.40}$$

To prove this, he first performed the integration by parts calculations given in
(17.32) through (17.33), and then wrote that this implied that

$$\int_0^1 x^{f-1}(1-x^g)^n\,dx = \frac{ng}{f+ng}\int_0^1 x^{f-1}(1-x^g)^{n-1}\,dx. \tag{17.41}$$

To obtain (17.40), he iterated (17.41) $n$ times and used the fact that

$$\int_0^1 x^{f-1}(1-x^g)^0\,dx = \frac1f.$$


33 See Eu. I-17 pp. 233-267. E 264, § 45. 
34 Eu. I-17 pp. 316-357, E 421.
In section 22 of this paper, Euler stated and proved what he called a general
theorem: If $n$ is a positive integer and $f$ and $g$ denote any positive numbers, then

$$\frac{1\cdot 2\cdots n}{(\lambda n+1)(\lambda n+2)\cdots(\lambda n+n)} = \frac{\lambda}{\lambda+1}\cdot ng\int_0^1 x^{f+\lambda ng-1}(1-x^g)^{n-1}\,dx \times \frac{\int_0^1 x^{f-1}(1-x^g)^{\lambda n-1}\,dx}{\int_0^1 x^{f-1}(1-x^g)^{(\lambda+1)n-1}\,dx}. \tag{17.42}$$

Euler proved this theorem by using (17.40) to find the values of integrals on the
right-hand side of (17.42). In corollary 1 of this theorem, he substituted the integral

$$\frac{\lambda n}{\lambda+1}\int_0^1 x^{\lambda n-1}(1-x)^{n-1}\,dx$$

for the left-hand side of (17.42).
Next, in section 24, Euler made a remarkable claim: This theorem holds, even if $n$
is not an integer. Since $\lambda$ is arbitrary, $\lambda n$ can be replaced by $m$, to obtain

$$\int_0^1 x^{m-1}(1-x)^{n-1}\,dx = g\int_0^1 x^{f+mg-1}(1-x^g)^{n-1}\,dx \times \frac{\int_0^1 x^{f-1}(1-x^g)^{m-1}\,dx}{\int_0^1 x^{f-1}(1-x^g)^{m+n-1}\,dx}. \tag{17.43}$$


Euler’s reason for this conclusion appears to be that, though the theorem had
been proved for the case in which $n$ was an integer, the integrals made sense in the
case where $n$ was any positive real number, and hence the result must be true for
any positive real number. Legendre followed Euler very closely in his exposition of
the Eulerian integrals; in the 1811 first volume of his Exercices de calcul intégral,
Legendre gave an additional comment on this point.³⁵ On page 279, he wrote that
both sides of (17.43) equal the same function of $n$ and $m = \lambda n$:

$$\frac{1}{\lambda n} - \frac{n-1}{1}\cdot\frac{1}{\lambda n+1} + \frac{(n-1)(n-2)}{1\cdot 2}\cdot\frac{1}{\lambda n+2} - \text{etc.} \tag{17.44}$$

Legendre’s point was that in (17.44), $n$ and $\lambda n$ need not be integers and could
be replaced by any positive real numbers. Therefore, the extension of the formula
from positive integers to positive real numbers made sense to him. To continue Euler’s
argument, he next observed that $1 - x^g = g\ln\frac1x$ when $g = 0$, so that, setting $f = 1$,
(17.43) gave him

$$\int_0^1 x^{m-1}(1-x)^{n-1}\,dx = \frac{\int_0^1\left(\ln\frac1x\right)^{n-1}dx\,\int_0^1\left(\ln\frac1x\right)^{m-1}dx}{\int_0^1\left(\ln\frac1x\right)^{m+n-1}dx}. \tag{17.45}$$


35 Legendre (1811-1817) vol. I, p. 279. 
Note that since, from (17.6),

$$\Gamma(s) = \int_0^1\left(\ln\frac1x\right)^{s-1}dx,$$

(17.45) can be written as

$$\int_0^1 x^{m-1}(1-x)^{n-1}\,dx = \frac{\Gamma(n)\Gamma(m)}{\Gamma(m+n)}.$$


Concerning the extension of formulas from integer values to real or even complex 
values, we recall that Euler had made a similar argument in proving the formula (4.18): 


(+e +G)G)= Ce") 


This argument of Euler and Legendre is an early example of George Peacock’s
flawed principle of permanence of equivalent forms. Peacock stated his
principle in his book on symbolic algebra, the second volume of his algebra book:³⁶
“Whatever algebraical forms are equivalent, when the symbols are general in form but
specific in value, will be equivalent likewise when the symbols are general in value
as well as in form.” According to such a principle, since the two sides of (17.43)
are identical with the algebraic form (17.44) when $n$ and $\lambda n$ are integers, they remain
equivalent when $n$ and $\lambda n$ are general, that is, real or complex values. It is surprising to
see Peacock make a claim of this kind after Gauss and Cauchy had already clarified the
matter. See, for example, our Chapter 4 for Cauchy’s proof of (4.18), and Chapter 23
for Gauss’s remarks on analytic continuation.

In fact, Fritz Carlson’s uniqueness or interpolation theorem is required to generalize
from integers to all real or complex values in the half-plane $\operatorname{Re} z > 0$. Carlson’s
theorem³⁷ states that if $f(z)$ is analytic in $\operatorname{Re} z > \delta > 0$; $f(z) = O(e^{k|z|})$, $k < \pi$;
and $f(z) = 0$ for $z = 1, 2, 3, \ldots$, then $f(z)$ is identically zero.

For the case here at hand, it would be sufficient to assume that $f(z)$ was bounded.
Atle Selberg gave a simple proof for this case using only Cauchy’s residue theorem.³⁸
To prove Carlson’s theorem in general, the machinery of the Phragmén–Lindelöf
theorems is required. To apply Carlson’s theorem here, first observe that the formula

$$\int_0^1 x^{m-1}(1-x)^{n-1}\,dx = \frac{\Gamma(m)\Gamma(n)}{\Gamma(m+n)}$$

holds when $n$ is a positive integer; the integral is bounded and analytic in $\operatorname{Re} n > \delta > 0$,
as is the right-hand side. Thus, by Carlson’s theorem, the formula holds when
$n$ is complex and $\operatorname{Re} n > \delta > 0$.³⁹


36 Peacock (1842-1845) vol. 2, p. 59. 

37 Carlson (1914) or see Titchmarsh (1962) sections 5.6-5.8. 
38 Selberg (1944). 

39 Andrews et al. (1999) pp. 110-111. 
17.5 Newman and the Product for Γ(x)


Francis W. Newman (1805-1897) earned a double first class from Oxford University
in 1826 and was elected a fellow of Balliol College the same year. He was a professor of
Latin at University College, London, when he published his insightful 1848 paper on
the gamma function.⁴⁰ Reflecting the breadth and depth of his scholarship, that same
year Newman published A History of the Hebrew Monarchy and, a year later, The Soul, 
Her Sorrows and Aspirations. His interests included political economy, grammar, and 
languages. For example, in 1872, he published a two-volume dictionary of modern 
Arabic. 

In the course of reworking Euler’s formula (17.38), Newman arrived at the result:
For any real or complex number $m$

$$\Gamma(m) = \frac{e^{-\gamma m}}{m}\prod_{k=1}^{\infty}\left(1+\frac mk\right)^{-1}e^{\frac mk}, \tag{17.46}$$

where $\gamma$ was Euler’s constant defined by

$$\gamma = \lim_{k\to\infty}\left(1+\frac12+\frac13+\cdots+\frac1k-\ln k\right). \tag{17.47}$$

The proof that this limit exists is given in our Chapter 20. This representation of
the gamma function is sometimes attributed to Weierstrass, who defined the function
by a similar infinite product in 1856. In fact, the German mathematician Schlömilch
discovered (17.46) even earlier, in 1843;⁴¹ his proof appears in the exercises. Newman
wrote the product in (17.38) as

$$\frac{m+n}{mn}\cdot\frac{1+(m+n)}{(1+m)(1+n)}\cdot\frac{1+\frac12(m+n)}{\left(1+\frac12m\right)\left(1+\frac12n\right)}\cdot\frac{1+\frac13(m+n)}{\left(1+\frac13m\right)\left(1+\frac13n\right)}\cdots$$

and then observed that the product

$$\psi(m) = m(1+m)\left(1+\frac12m\right)\left(1+\frac13m\right)\cdots$$

was divergent because its logarithm was

$$\ln m + \left(1+\frac12+\frac13+\cdots\right)m - \frac12\left(1+\frac1{2^2}+\frac1{3^2}+\cdots\right)m^2 + \frac13\left(1+\frac1{2^3}+\frac1{3^3}+\cdots\right)m^3 - \cdots$$

and the coefficient of $m$ was a divergent series. To remove this defect, he defined a
new product,


40 Newman (1848). 
41 Schlömilch (1843).
$$m\cdot\frac{1+m}{e^{m}}\cdot\frac{1+\frac12m}{e^{\frac12m}}\cdot\frac{1+\frac13m}{e^{\frac13m}}\cdots, \tag{17.48}$$

whose logarithm did not have this divergent part, so that the product converged.
Newman denoted the left-hand side of (17.38) by $B(m,n)$ (he
actually used $F$ instead of $B$), so that he could write

$$B(m,n) = \frac{\psi(m+n)}{\psi(m)\psi(n)}, \tag{17.49}$$

where the $\psi$s now represented convergent products, that is, $\psi(m)$ was defined by the
product in (17.48). He then changed $m$ to $m+k$ in (17.49) to obtain

$$B(m+k,n) = \psi(m+k+n) \div \left(\psi(m+k)\psi(n)\right),$$

whence

$$\frac{B(m,n)}{B(m+k,n)} = \frac{\psi(m+n)\psi(m+k)}{\psi(m)\psi(m+n+k)}. \tag{17.50}$$

Since interchanging $n$ and $k$ did not change the right-hand side of (17.50), Newman
found that

$$\frac{B(m,n)}{B(m+k,n)} = \frac{B(m,k)}{B(m+n,k)}. \tag{17.51}$$

Now a change of variables $t$ to $1-t$ showed that

$$B(m,n) = \int_0^1 t^{m-1}(1-t)^{n-1}\,dt = \int_0^1 t^{n-1}(1-t)^{m-1}\,dt = B(n,m);$$

hence by (17.51)

$$B(m,n) = \frac{B(k,m)B(k+m,n)}{B(k,m+n)}. \tag{17.52}$$


Newman let $k \to \infty$ and used (17.36) to obtain $B(k+m,n) = B(k,n)$ so that he got

$$B(m,n) = \frac{B(k,m)\,B(k,n)}{B(k,m+n)}, \quad k = \infty. \tag{17.53}$$

At this point, Newman remarked that (17.53) “had a close analogy” with (17.49)
and this led him to further investigate $B(k,m)$ when $k = \infty$. He set $t = \frac yk$ to find

$$B(k+1,m) = B(m,k+1) = \int_0^1 t^{m-1}(1-t)^k\,dt = \int_0^{\infty}\left(1-\frac yk\right)^k y^{m-1}k^{-m}\,dy = k^{-m}\int_0^{\infty}e^{-y}y^{m-1}\,dy. \tag{17.54}$$
Following Legendre, Newman denoted the integral by $\Gamma(m)$. He then applied
(17.54) in (17.53) to obtain

$$B(m,n) = \frac{\Gamma(m)\Gamma(n)}{\Gamma(m+n)}. \tag{17.55}$$

Comparing (17.49) with (17.55), Newman concluded that $\Gamma(m) = \chi(m)\left(\psi(m)\right)^{-1}$,
where $\chi$ represented an unknown function. Substituting this in (17.55) and comparing
with (17.49), he saw that

$$\chi(m+n) = \chi(m)\chi(n),$$

an equation solvable by differentiation or by even more elementary methods to get

$$\chi(m) = e^{-\gamma m},$$

with $\gamma$ an unknown constant. Thus, he could write

$$\Gamma(m) = \frac{e^{-\gamma m}}{m}\prod_{k=1}^{\infty}\left(1+\frac mk\right)^{-1}e^{\frac mk}. \tag{17.56}$$


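To determine the value of $\gamma$, Newman took the logarithm of equation (17.56)
to find

$$\log\Gamma(m) = -\log m - \gamma m + \left(m - \log(1+m)\right) + \left(\frac m2 - \log\left(1+\frac m2\right)\right) + \cdots$$

and then set $m = 1$ to discover that

$$\gamma = \lim_{k\to\infty}\left[(1-\log 2) + \left(\frac12 - \log\frac32\right) + \cdots + \left(\frac1k - \log\left(1+\frac1k\right)\right)\right] = \lim_{k\to\infty}\left(1+\frac12+\cdots+\frac1k-\log(k+1)\right).$$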


Hence, $\gamma$ was Euler’s constant.
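Newman's product and his limit for the constant can both be checked numerically. The sketch below assumes the product in the form $\Gamma(m) = \frac{e^{-\gamma m}}{m}\prod_{k\ge 1}(1+m/k)^{-1}e^{m/k}$ and the constant as the limit of $1 + \frac12 + \cdots + \frac1k - \log(k+1)$; the function names and truncation points are illustrative, and both limits converge only like $1/k$.

```python
import math

def euler_gamma(k):
    """1 + 1/2 + ... + 1/k - log(k + 1), the expression obtained at m = 1."""
    return sum(1.0 / j for j in range(1, k + 1)) - math.log(k + 1)

def newman_gamma(m, terms, g):
    """Partial product e^{-g m}/m * prod_{k=1}^{terms} (1 + m/k)^{-1} e^{m/k}."""
    p = math.exp(-g * m) / m
    for k in range(1, terms + 1):
        p *= math.exp(m / k) / (1.0 + m / k)
    return p

g = euler_gamma(10 ** 6)
print(g)                                             # approaches 0.5772156649...
print(newman_gamma(2.5, 10 ** 5, g), math.gamma(2.5))
```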

In this way, Newman started from the beta integral $B(m,n) = \int_0^1 t^{m-1}(1-t)^{n-1}\,dt$,
$m, n > 0$, and obtained the basic formulas for the gamma function, based on the
functional relation that Euler had found by integration by parts:

$$B(m,n) = \frac{m+n}{m}B(m+1,n). \tag{17.57}$$

Both Euler and Newman iterated the functional relation (17.57) to end up with a 
divergent product multiplied by a vanishing integral, as in (17.34). For this reason 
Euler needed another sequence of beta integrals, (17.35), that also tended to zero, 
multiplied by a divergent product, so that he could take the ratio and thereby 
find a finite nonzero value. See, for example, (17.38). However, Newman’s work, 
particularly equation (17.54), shows that it might be possible to avoid this difficulty. 
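To see this modification of the Euler–Newman argument, found by Richard Askey,⁴²
first note that from (17.57) it is easy to obtain the functional relation

$$B(m,n) = \frac{m+n}{n}B(m,n+1). \tag{17.58}$$

Iterate (17.58) $k$ times and change variables as was done in (17.54) to arrive at

$$\begin{aligned}B(m,n) &= \frac{(m+n)(m+n+1)\cdots(m+n+k-1)}{n(n+1)\cdots(n+k-1)}\int_0^k\left(\frac tk\right)^{m-1}\left(1-\frac tk\right)^{n+k-1}\frac{dt}{k}\\ &= \frac{(m+n)\cdots(m+n+k-1)}{k!\,k^{m+n-1}}\cdot\frac{k!\,k^{n-1}}{n\cdots(n+k-1)}\int_0^k t^{m-1}\left(1-\frac tk\right)^{k+n-1}dt.\end{aligned} \tag{17.59}$$

Taking (17.4) as the definition of the gamma function, we can then take the limit in
(17.59) as $k \to \infty$ to see that

$$B(m,n) = \frac{\Gamma(n)}{\Gamma(m+n)}\int_0^{\infty}t^{m-1}e^{-t}\,dt. \tag{17.60}$$

Taking $n = 1$ in (17.60) and observing that $\Gamma(m+1) = m\Gamma(m)$ and that $B(m,1) = \frac1m$, we get

$$\Gamma(m) = \int_0^{\infty}t^{m-1}e^{-t}\,dt. \tag{17.61}$$

Newman’s product for the gamma function can then be obtained directly:

$$\begin{aligned}\Gamma(m) &= \lim_{k\to\infty}\frac{k!\,k^{m-1}}{m(m+1)\cdots(m+k-1)}\\ &= \lim_{k\to\infty}\frac{k^m}{m\left(1+\frac m1\right)\left(1+\frac m2\right)\cdots\left(1+\frac m{k-1}\right)}\\ &= \lim_{k\to\infty}\frac{e^{m\log k}}{m}\prod_{s=1}^{k-1}\left(1+\frac ms\right)^{-1}\\ &= \lim_{k\to\infty}\frac1m\,e^{m\left(\log k-1-\frac12-\cdots-\frac1{k-1}\right)}\prod_{s=1}^{k-1}\left(1+\frac ms\right)^{-1}e^{\frac ms}\\ &= \frac{e^{-\gamma m}}{m}\prod_{s=1}^{\infty}\left(1+\frac ms\right)^{-1}e^{\frac ms}.\end{aligned}$$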


42 Andrews et al. (1999) pp. 5-6. 
Euler could have obtained (17.55) by using (17.4) in (17.38). Thus, the left-hand
side of (17.38) is clearly $B(m,n)$, leading us to

$$\begin{aligned}B(m,n) &= \lim_{s\to\infty}\frac{s!\,(m+n)\cdots(m+n+s)}{n(n+1)\cdots(n+s)\,m(m+1)\cdots(m+s)}\\ &= \lim_{s\to\infty}\left(\frac{s!\,s^n}{n(n+1)\cdots(n+s)}\cdot\frac{s!\,s^m}{m(m+1)\cdots(m+s)}\cdot\frac{(m+n)\cdots(m+n+s)}{s!\,s^{m+n}}\right)\\ &= \frac{\Gamma(n)\Gamma(m)}{\Gamma(m+n)}.\end{aligned}$$


17.6 Gauss’s Theory of the Gamma Function 


Gauss’s work on the gamma function⁴³ was marked by his systematic approach and
a greater sense of rigor than was earlier practiced. He discussed the convergence of
series and products but did not justify changing the order of limits, or term-by-term
integration of infinite series. He started with the finite product

$$\Pi(k,z) = \frac{1\cdot 2\cdot 3\cdots k}{(z+1)(z+2)\cdots(z+k)}\,k^z, \tag{17.62}$$

where $k$ was a positive integer and $z$ a complex number not equal to a negative
integer. He first proved that the limit as $k \to \infty$ existed. For this purpose, he
noted that

$$\Pi(k+n,z) = \Pi(k,z)\cdot\frac{\left(1-\frac1{k+1}\right)^{-z}}{1+\frac z{k+1}}\cdots\frac{\left(1-\frac1{k+n}\right)^{-z}}{1+\frac z{k+n}}, \tag{17.63}$$

and that the logarithm of the product written after $\Pi(k,z)$ remained finite as $n \to \infty$.
This proved that the limit existed.

Wallis and Euler had perceived the significance of the gamma function for the 
evaluation of certain definite integrals. Gauss’s important contribution here was to use 
the gamma function to sum series; some early hints of this also appeared in the works 
of Stirling, Euler, and Pfaff. Gauss’s insight opened up the subject of summation of 
series of the hypergeometric type. Moreover, Gauss used (17.14) to establish the basic 
results on the gamma function. Interestingly, in this connection Gauss made use of 
two series discovered by Newton while he was a student at Cambridge:

$$\arcsin x = x + \frac12\cdot\frac{x^3}{3} + \frac{1\cdot 3}{2\cdot 4}\cdot\frac{x^5}{5} + \cdots$$


43 Gauss (1813). 
and

$$\sin n\theta = n\sin\theta - \frac{n(n^2-1^2)}{3!}\sin^3\theta + \frac{n(n^2-1^2)(n^2-3^2)}{5!}\sin^5\theta - \cdots,$$

where $n$ was not necessarily an integer. Gauss’s analysis of the convergence of the
hypergeometric series showed that the first formula was true for $|x| \le 1$ and the
second for $|\theta| \le \frac\pi2$. By contrast, Newton took a much more cavalier approach
toward convergence questions. He is known to have discussed the convergence of the
geometric series on one occasion, but his remarks contained no new insights. To write
(17.14) in a compact form, we employ the modern notation for a shifted factorial:

$$(a)_n = \begin{cases}a(a+1)(a+2)\cdots(a+n-1), & \text{for } n > 0,\\ 1, & \text{for } n = 0.\end{cases} \tag{17.64}$$


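We can now write Gauss’s formula in the form

$$\sum_{n=0}^{\infty}\frac{(a)_n(b)_n}{n!\,(c)_n} = \frac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)} \tag{17.65}$$

when $\operatorname{Re}(c-a-b) > 0$. To obtain the value of $\Gamma\left(\frac12\right)$, Gauss took $x = 1$ in Newton’s
series for $\arcsin x$ and used (17.65) to get

$$\frac\pi2 = \frac{\Gamma\left(\frac32\right)\Gamma\left(\frac12\right)}{\Gamma(1)\Gamma(1)} = \frac12\left(\Gamma\left(\frac12\right)\right)^2 \tag{17.66}$$

or

$$\Gamma\left(\frac12\right) = \sqrt\pi. \tag{17.67}$$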
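Gauss's summation formula lends itself to a direct numerical test. The sketch below assumes the formula in the form $\sum_{n\ge 0}(a)_n(b)_n/\big(n!\,(c)_n\big) = \Gamma(c)\Gamma(c-a-b)/\big(\Gamma(c-a)\Gamma(c-b)\big)$ for real parameters with $c - a - b > 0$. The case $a = b = \frac12$, $c = \frac32$ is the arcsine series at $x = 1$, which converges slowly; a second, faster case is included for comparison. Function names are illustrative.

```python
import math

def gauss_sum(a, b, c, terms):
    """Partial sum of sum_{n>=0} (a)_n (b)_n / (n! (c)_n)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # ratio of consecutive terms: (a+n)(b+n) / ((n+1)(c+n))
        term *= (a + n) * (b + n) / ((n + 1) * (c + n))
    return total

def gauss_closed(a, b, c):
    """Right-hand side of the summation formula."""
    return math.gamma(c) * math.gamma(c - a - b) / (math.gamma(c - a) * math.gamma(c - b))

print(gauss_sum(0.5, 0.5, 1.5, 10 ** 6), gauss_closed(0.5, 0.5, 1.5), math.pi / 2)
print(gauss_sum(0.5, 0.5, 3.0, 2000), gauss_closed(0.5, 0.5, 3.0))
```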
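He then derived Euler’s reflection formula (17.11) from (17.65) by taking $\theta = \frac\pi2$ in
Newton’s series for $\sin n\theta$ where $n$ was any real number. In that case,

$$\begin{aligned}\sin\frac{n\pi}{2} &= n - \frac{n(n-1)(n+1)}{2\cdot 3} + \frac{n(n-1)(n+1)(n-3)(n+3)}{2\cdot 3\cdot 4\cdot 5} - \cdots\\ &= n\,\frac{\Gamma\left(\frac32\right)\Gamma\left(\frac12\right)}{\Gamma\left(1+\frac n2\right)\Gamma\left(1-\frac n2\right)} = \frac{n\pi}{2\,\Gamma\left(1+\frac n2\right)\Gamma\left(1-\frac n2\right)}.\end{aligned} \tag{17.68}$$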
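Note that in the last two steps, (17.67) and (17.65) were employed. Gauss next set
$x = \frac n2$ in (17.68) to obtain

$$\Gamma(1+x)\Gamma(1-x) = \frac{\pi x}{\sin \pi x} \tag{17.69}$$

or

$$\Gamma(x)\Gamma(1-x) = \frac{\pi}{\sin \pi x}. \tag{17.70}$$

Finally, to derive (17.10) from (17.65), he wrote

$$\begin{aligned}\int_0^1 x^{m-1}(1-x)^{n-1}\,dx &= \int_0^1\left(x^{m-1} - (n-1)x^m + \frac{(n-1)(n-2)}{2!}x^{m+1} - \cdots\right)dx\\ &= \frac1m - \frac{n-1}{m+1} + \frac{(n-1)(n-2)}{2!\,(m+2)} - \frac{(n-1)(n-2)(n-3)}{3!\,(m+3)} + \cdots\\ &= \frac1m\left[1 - \frac{(n-1)m}{m+1} + \frac{(n-1)(n-2)m(m+1)}{2!\,(m+1)(m+2)} - \cdots\right]\\ &= \frac1m\cdot\frac{\Gamma(m+1+n-1-m)\Gamma(m+1)}{\Gamma(m+1+n-1)\Gamma(m+1-m)} = \frac{\Gamma(m)\Gamma(n)}{\Gamma(m+n)}.\end{aligned} \tag{17.71}$$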


Gauss derived the integral representation for the gamma function by setting $y = nx$
in (17.71), where $n$ was an integer, to obtain

$$\int_0^n y^{m-1}\left(1-\frac yn\right)^{n-1}dy = \frac{n!\,n^{m-1}}{m(m+1)\cdots(m+n-1)}.$$

The limit as $n \to \infty$ gave

$$\int_0^{\infty}y^{m-1}e^{-y}\,dy = \Gamma(m),$$

a result also due to Euler. Gauss did not justify interchanging the limit and the
integral, that is, $\lim_{n\to\infty}\int_0^n = \int_0^{\infty}\lim_{n\to\infty}$, in this situation.
Gauss also defined a new function $\Psi(z)$, given by

$$\Psi(z) = \frac{d}{dz}\ln\Gamma(z+1) = \frac{\Gamma'(z+1)}{\Gamma(z+1)}. \tag{17.72}$$

He observed that $\Psi(z)$, the digamma function, was almost as remarkable a function
as $\Gamma(z)$ and noted some of its more important properties. See the exercises for some
of these properties.
In section 26 of his 1813 paper, Gauss gave an elegant proof of the multiplication
formula (17.15). First set
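$$\Gamma(k,z) = \frac{k!\,k^{z-1}}{z(z+1)\cdots(z+k-1)}.$$

Gauss started by observing that for a positive integer $n$, the expression

$$\frac{n^{nz}\,\Gamma(k,z+1)\,\Gamma\left(k,z+1-\frac1n\right)\Gamma\left(k,z+1-\frac2n\right)\cdots\Gamma\left(k,z+1-\frac{n-1}{n}\right)}{\Gamma(nk,nz+1)} \tag{17.73}$$

could be reduced to

$$\frac{(k!)^n\,n^{nk}}{(nk)!\,k^{\frac{n-1}{2}}}$$

and hence was independent of $z$. Therefore, the value of the expression (17.73) was
equal to its value at $z = 0$, that is, since $\Gamma(k,1) = \Gamma(nk,1) = 1$,

$$\Gamma(k,1)\,\Gamma\left(k,1-\frac1n\right)\Gamma\left(k,1-\frac2n\right)\cdots\Gamma\left(k,1-\frac{n-1}{n}\right) = \Gamma\left(k,1-\frac1n\right)\Gamma\left(k,1-\frac2n\right)\cdots\Gamma\left(k,\frac1n\right). \tag{17.74}$$

He then let $k \to \infty$ in (17.73) and (17.74) to find that

$$\frac{n^{nz}\,\Gamma(z+1)\,\Gamma\left(z+1-\frac1n\right)\cdots\Gamma\left(z+\frac1n\right)}{\Gamma(nz+1)} = \Gamma\left(\frac1n\right)\Gamma\left(\frac2n\right)\cdots\Gamma\left(1-\frac2n\right)\Gamma\left(1-\frac1n\right). \tag{17.75}$$

To determine the value of the right-hand side of (17.75), Gauss multiplied it by
itself in reverse order, and used the reflection formula (17.11) to get

$$\left(\Gamma\left(\frac1n\right)\Gamma\left(\frac2n\right)\cdots\Gamma\left(\frac{n-1}{n}\right)\right)^2 = \frac{\pi}{\sin\frac\pi n}\cdot\frac{\pi}{\sin\frac{2\pi}{n}}\cdots\frac{\pi}{\sin\frac{(n-1)\pi}{n}} = \frac{(2\pi)^{n-1}}{n}. \tag{17.76}$$

Next, following Gauss, write $\Gamma(z+1) = z\Gamma(z)$ and $\Gamma(nz+1) = nz\Gamma(nz)$ in (17.75).
Then, use (17.76) in the form

$$\Gamma\left(\frac1n\right)\Gamma\left(\frac2n\right)\cdots\Gamma\left(\frac{n-1}{n}\right) = \frac{(2\pi)^{\frac{n-1}{2}}}{\sqrt n}$$

to obtain the multiplication formula.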
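The outcome of Gauss's argument can be checked numerically. The sketch below assumes that the multiplication formula has its standard form $\Gamma(z)\Gamma\left(z+\frac1n\right)\cdots\Gamma\left(z+\frac{n-1}{n}\right) = (2\pi)^{\frac{n-1}{2}}n^{\frac12-nz}\Gamma(nz)$, and that the intermediate product is $\Gamma\left(\frac1n\right)\cdots\Gamma\left(\frac{n-1}{n}\right) = (2\pi)^{\frac{n-1}{2}}/\sqrt n$. The function names and test values are illustrative.

```python
import math

def multiplication_lhs(z, n):
    """Gamma(z) Gamma(z + 1/n) ... Gamma(z + (n-1)/n)."""
    return math.prod(math.gamma(z + j / n) for j in range(n))

def multiplication_rhs(z, n):
    """(2 pi)^((n-1)/2) n^(1/2 - n z) Gamma(n z)."""
    return (2 * math.pi) ** ((n - 1) / 2) * n ** (0.5 - n * z) * math.gamma(n * z)

def gamma_fraction_product(n):
    """Gamma(1/n) Gamma(2/n) ... Gamma((n-1)/n)."""
    return math.prod(math.gamma(j / n) for j in range(1, n))

for n in (2, 3, 5):
    print(n, multiplication_lhs(0.7, n), multiplication_rhs(0.7, n))
    print(n, gamma_fraction_product(n), (2 * math.pi) ** ((n - 1) / 2) / math.sqrt(n))
```

The case $n = 2$ is Legendre's duplication formula.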
17.7. Euler: Series to Product


In a very interesting paper,⁴⁴ “De mirabilibus propriatibus unciarum quae in evolutione
binomii occurrent,” Euler stated two theorems, given here in modern notation.

Theorem 1

$$1 + \binom n1^2 + \binom n2^2 + \binom n3^2 + \cdots = \frac{2^{2n+1}}{\pi}\int_0^1\frac{x^{2n}}{\sqrt{1-x^2}}\,dx. \tag{17.77}$$

Theorem 2

$$1 + \binom n1\binom{n'}1 + \binom n2\binom{n'}2 + \binom n3\binom{n'}3 + \cdots = \frac{1}{n\int_0^1 x^{n-1}(1-x)^{n'}\,dx}. \tag{17.78}$$

In this paper, Euler wrote $\left[\frac nk\right]$ for $\binom nk$ so that his notation here is very similar to our
modern notation. Of course, the first theorem is a special case of the second, though
this may not be immediately obvious. But to see this, observe that an application of
(17.10) shows that the right-hand side of (17.78) is equal to

$$\frac{\Gamma(n+n'+1)}{\Gamma(n+1)\Gamma(n'+1)}. \tag{17.79}$$


By making the substitution $y = x^2$ in the integral in (17.77) and using (17.10) and
(17.7), we perceive that the right-hand side of (17.77) can be rewritten as

$$\frac{2^{2n}\,\Gamma\left(n+\frac12\right)}{\sqrt\pi\,\Gamma(n+1)}. \tag{17.80}$$

Finally, it follows from Legendre’s duplication formula, i.e. the formula for the case
$n = 2$ in (17.15), that when $n = n'$, the expression in (17.79) is identical with that in
(17.80).

In fact, Euler was unable to prove his two formulas (17.77) and (17.78). After
stating theorem 1, he wrote that he found it remarkable that there was no direct way to
prove it in general. Instead, in his paper he verified a large number of particular cases
and did the same thing for theorem 2. In particular, he checked the case for theorem
2 in which $n' = -n$, $0 < n < 1$. His verification here is highly interesting because
he showed how a particular sum can be converted into a product. First note that for
$n' = -n$, (17.11) shows that (17.78) is equal to

$$\frac{1}{\Gamma(1+n)\Gamma(1-n)} = \frac{\sin n\pi}{n\pi}.$$


44 Eu. I-15 pp. 528-568. E 575.
But Euler evaluated the integral in (17.78) for $n' = -n$ in a different manner. He
transformed the integral into

$$\int_0^{\infty}\frac{y^{n-1}\,dy}{1+y},$$

the value of which he already knew from his work on the integration of rational
functions: $\frac{\pi}{\sin n\pi}$. In this connection, see (17.13). Thus, in section 44 of his paper,
Euler found that he needed to verify the formula

$$1 - \frac{n^2}{1^2} + \frac{n^2(n^2-1)}{1^2\cdot 2^2} - \frac{n^2(n^2-1)(n^2-2^2)}{1^2\cdot 2^2\cdot 3^2} + \cdots = \frac{\sin n\pi}{n\pi}, \tag{17.81}$$

in which the series is obtained after taking $n' = -n$ on the left-hand side of (17.78).
To verify the sum in (17.81), he denoted the expression on its left-hand side by $S$ and
observed that $1 - n^2$ was a common factor of the sum. Dividing by this common factor,
he arrived at

$$\frac{S}{1-n^2} = 1 - \frac{n^2}{2^2} + \frac{n^2(n^2-2^2)}{2^2\cdot 3^2} - \frac{n^2(n^2-2^2)(n^2-3^2)}{2^2\cdot 3^2\cdot 4^2} + \cdots. \tag{17.82}$$

One can now see that $1 - \frac{n^2}{2^2}$ is a common factor in the sum in (17.82). Thus, rewrite
the sum contained in (17.82) as

$$\frac{S}{\left(1-\frac{n^2}{1^2}\right)\left(1-\frac{n^2}{2^2}\right)} = 1 - \frac{n^2}{3^2} + \frac{n^2(n^2-3^2)}{3^2\cdot 4^2} - \cdots, \tag{17.83}$$

where $1 - \frac{n^2}{3^2}$ is clearly the common factor. Repeating this process infinitely often,
Euler found that

$$S = \left(1-\frac{n^2}{1^2}\right)\left(1-\frac{n^2}{2^2}\right)\left(1-\frac{n^2}{3^2}\right)\cdots; \tag{17.84}$$

since the product was $\frac{\sin n\pi}{n\pi}$, Euler had succeeded in verifying the case in his theorem 2
for $n' = -n$. Observe that in moving from the series in (17.81) to the product in
(17.84), Euler employed (15.69) in reverse, that is

$$1 - \alpha - \beta(1-\alpha) - \gamma(1-\alpha)(1-\beta) - \delta(1-\alpha)(1-\beta)(1-\gamma) - \cdots = (1-\alpha)(1-\beta)(1-\gamma)(1-\delta)\cdots.$$

Based on this work of Euler, we point out that if (17.81) could be proved by a
different method, we would have a new and remarkable proof of the product formula
for $\sin \pi x$.

It turns out that (17.78) is a particular case of Gauss’s summation formula (17.65).
Observe that the series on the left-hand side of (17.78) may be written as

$$1 + \frac{(-n)(-n')}{1!\,1!} + \frac{(-n)(-n+1)(-n')(-n'+1)}{2!\,2!} + \cdots;$$

use the notation defined in (17.64) to rewrite as

$$\sum_{k=0}^{\infty}\frac{(-n)_k(-n')_k}{k!\,(1)_k} = \frac{\Gamma(1)\Gamma(n+n'+1)}{\Gamma(n+1)\Gamma(n'+1)}.$$

This completes our proof of (17.78) and therefore also of (17.81).
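A short numerical experiment illustrates both the integer case and the case $n' = -n$. The sketch below assumes that the sum in theorem 2 is $\sum_k\binom nk\binom{n'}k$ with generalized binomial coefficients and that its value is $\Gamma(n+n'+1)/\big(\Gamma(n+1)\Gamma(n'+1)\big)$; the function names are illustrative, and the series is summed through the ratio of consecutive terms.

```python
import math

def theorem2_sum(n, n_prime, terms):
    """Partial sum of sum_k binom(n, k) * binom(n', k) for real n, n'."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # ratio of consecutive terms: (n-k)(n'-k) / (k+1)^2
        term *= (n - k) * (n_prime - k) / ((k + 1) ** 2)
    return total

def theorem2_closed(n, n_prime):
    """Gamma(n + n' + 1) / (Gamma(n + 1) Gamma(n' + 1))."""
    return math.gamma(n + n_prime + 1) / (math.gamma(n + 1) * math.gamma(n_prime + 1))

print(theorem2_sum(5, 3, 10), theorem2_closed(5, 3))          # integer case
n = 0.4
print(theorem2_sum(n, -n, 200000), math.sin(n * math.pi) / (n * math.pi))
```

For integers the sum terminates and reduces to the Vandermonde identity; for $n' = -n$ the terms decay only like $1/k^2$, so many terms are needed.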


17.8 Euler: Products to Continued Fractions 


In his 1739 paper “De fractionibus continuis, observationes,”⁴⁵ Euler investigated
the relation of certain infinite products with continued fractions. For example, he
considered the infinite product

$$\frac{p(p+2q+r)(p+2r)(p+2q+3r)\cdots}{(p+2q)(p+r)(p+2q+2r)(p+3r)\cdots}. \tag{17.85}$$

This and other similar products had earlier appeared prominently in Euler’s work
on the gamma and beta functions. In this paper, Euler noted that the product (17.85)
was a ratio of beta integrals:

$$\frac{\int_0^1 y^{p+2q-1}(1-y^{2r})^{-\frac12}\,dy}{\int_0^1 y^{p-1}(1-y^{2r})^{-\frac12}\,dy}. \tag{17.86}$$

In an analysis quite similar to one carried out by Wallis, he associated a sequence
$A_0, A_1, A_2, \ldots$ (he wrote $A, B, C, \ldots$) with the product. The sequence was defined by
the relations

$$A_0A_1 = \frac{p}{p+2q},\quad A_2A_3 = \frac{p+2r}{p+2q+2r},\quad A_4A_5 = \frac{p+4r}{p+2q+4r},\ \ldots. \tag{17.87}$$

He added the requirement that

$$A_1A_2 = \frac{p+r}{p+2q+r},\quad A_3A_4 = \frac{p+3r}{p+2q+3r},\quad A_5A_6 = \frac{p+5r}{p+2q+5r},\ \ldots. \tag{17.88}$$

Observe that the relations (17.87) and (17.88) gave Euler

$$A_0 = \frac{p}{p+2q}\cdot\frac1{A_1} = \frac{p}{p+2q}\cdot\frac{p+2q+r}{p+r}\,A_2 = \frac{p}{p+2q}\cdot\frac{p+2q+r}{p+r}\cdot\frac{p+2r}{p+2q+2r}\cdot\frac1{A_3} = \cdots.$$

45 Eu. I-14 pp. 219-349. E 123, § 21-24.
Euler desired a continued fraction representation for the infinite product $A_0$,
given by (17.85). To eliminate the denominators, Euler defined another sequence
$a_0, a_1, a_2, \ldots$ by the relations

$$A_0 = \frac{a_0}{p+2q-r},\quad A_1 = \frac{a_1}{p+2q},\quad A_2 = \frac{a_2}{p+2q+r},\quad A_3 = \frac{a_3}{p+2q+2r},\ \ldots, \tag{17.89}$$

so that

$$a_0a_1 = (p+2q-r)p,\quad a_1a_2 = (p+2q)(p+r), \tag{17.90}$$

$$a_2a_3 = (p+2q+r)(p+2r),\ \ldots. \tag{17.91}$$

He then set

$$a_0 = m - r + \frac1{\alpha_1},\quad a_1 = m + \frac1{\alpha_2}, \tag{17.92}$$

$$a_2 = m + r + \frac1{\alpha_3},\quad a_3 = m + 2r + \frac1{\alpha_4},\ \ldots \tag{17.93}$$

to obtain a continued fraction for $a_0$ dependent on $m$. Thus, Euler potentially had an
infinite number of continued fractions depending on the parameter $m$ and each of these
continued fractions thus had the value $a_0$. He then chose special values of $m$ such as
$p - r$, $p + q$, $p + 2q$ to obtain several interesting continued fractions for specific $a_0$.
To simplify the equations satisfied by $\alpha_1, \alpha_2, \alpha_3, \ldots$, Euler set

$$P = p(p+2q-r) - m(m-r) \quad\text{and}\quad Q = 2r(p+q-m).$$

Then he could write

$$P\alpha_1\alpha_2 - (m-r)\alpha_1 = m\alpha_2 + 1, \tag{17.94}$$

$$(P+Q)\alpha_2\alpha_3 - m\alpha_2 = (m+r)\alpha_3 + 1, \tag{17.95}$$

$$(P+2Q)\alpha_3\alpha_4 - (m+r)\alpha_3 = (m+2r)\alpha_4 + 1, \tag{17.96}$$

$$(P+3Q)\alpha_4\alpha_5 - (m+2r)\alpha_4 = (m+3r)\alpha_5 + 1, \tag{17.97}$$

$$\cdots, \tag{17.98}$$

relations that are easy to obtain. Euler did not give the details; for the convenience of
the reader, we indicate how they may be derived. First, multiplying the equations in
(17.92) produces

$$a_0a_1 = \left(m - r + \frac1{\alpha_1}\right)\left(m + \frac1{\alpha_2}\right).$$

Then, using the first equation in (17.90), we have
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1 1 
pip +2) =(m r+ ) (ms ) 
a1 a2 


P(p + 24 —r)ajaz = m(m — r)ajan + (m —r)a; + man +1 


or 


or 
Pajar —(m—r)a; = maz +1, 


the first of Euler’s relations, (17.94). The second, (17.95), can be verified in a similar 
manner by multiplying the second equation in (17.92) and the first equation in (17.93) 


to find 
1 1 
aja = (m | ) (m tr- ): (17.99) 
a2 03 


combining (17.99) with the second equation in (17.90) then produces 


(p + 2q)(p + r)o203 = m(m + r)a203 + maz + (m +r)o3 +I, 
leading us to 
(P + Q)a203 — maz = (m+ r)ja3+1, 


or (17.95). Thus we see how to derive Euler’s relations (17.94) through (17.97) and so 
on, from which he deduced 


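$$\alpha_1 = \frac{m\alpha_2+1}{P\alpha_2-(m-r)} = \frac{m}{P}+\frac{p(p+2q-r):P^2}{-(m-r):P+\alpha_2},$$

$$\alpha_2 = \frac{(m+r)\alpha_3+1}{(P+Q)\alpha_3-m} = \frac{m+r}{P+Q}+\frac{(p+r)(p+2q):(P+Q)^2}{-m:(P+Q)+\alpha_3},$$

$$\alpha_3 = \frac{(m+2r)\alpha_4+1}{(P+2Q)\alpha_4-(m+r)} = \frac{m+2r}{P+2Q}+\frac{(p+2r)(p+2q+r):(P+2Q)^2}{-(m+r):(P+2Q)+\alpha_4},\ \text{etc.},$$

where $x:y = \frac{x}{y}$.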
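The coefficients $P, P+Q, P+2Q, \ldots$ in (17.94) through (17.97) all come from one polynomial identity. The sketch below checks it exactly with rational arithmetic. It assumes, as an extrapolation from the cases listed in (17.90) through (17.93), that the general pattern is $a_ka_{k+1} = (p+2q+(k-1)r)(p+kr)$ and $a_k = m+(k-1)r+1/\alpha_{k+1}$; the function names are illustrative only.

```python
from fractions import Fraction
from random import Random


def coefficient_matches(p, q, r, m, k):
    """Return True if (p+2q+(k-1)r)(p+kr) - (m+(k-1)r)(m+kr) equals P + kQ."""
    P = p * (p + 2 * q - r) - m * (m - r)
    Q = 2 * r * (p + q - m)
    product_of_a = (p + 2 * q + (k - 1) * r) * (p + k * r)   # assumed value of a_k * a_{k+1}
    product_of_m = (m + (k - 1) * r) * (m + k * r)           # leading parts of a_k and a_{k+1}
    return product_of_a - product_of_m == P + k * Q


rng = Random(0)  # fixed seed so the check is reproducible


def random_fraction():
    return Fraction(rng.randint(-9, 9), rng.randint(1, 9))


# Exact check at many random rational points for k = 0, ..., 5.
all_match = all(
    coefficient_matches(random_fraction(), random_fraction(),
                        random_fraction(), random_fraction(), k)
    for k in range(6) for _ in range(50)
)
print(all_match)
```

The case $k = 0$ is the definition of $P$, and $k = 1$ is the step used above to obtain (17.95).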


To write the resulting continued fraction for $\alpha_1$ in simpler form, Euler set

$$R = p^2+2pq-mp-mq+qr\quad\text{and}\quad S = pr+qr-mr$$

and obtained the continued fraction for $\alpha_1$ as

$$\alpha_1 = \frac{m}{P}+\frac{p(p+2q-r):P^2}{2rR:P(P+Q)\,+}\ \frac{(p+r)(p+2q):(P+Q)^2}{2r(R+S):(P+Q)(P+2Q)\,+}\ \frac{(p+2r)(p+2q+r):(P+2Q)^2}{2r(R+2S):(P+2Q)(P+3Q)\,+}\ \cdots. \tag{17.100}$$

Euler’s notation may be easier to understand when written with nested fractions:

$$\alpha_1 = \frac{m}{P}+\cfrac{\dfrac{p(p+2q-r)}{P^2}}{\dfrac{2rR}{P(P+Q)}+\cfrac{\dfrac{(p+r)(p+2q)}{(P+Q)^2}}{\dfrac{2r(R+S)}{(P+Q)(P+2Q)}+\cdots}}.$$


He could therefore write down the continued fraction for $a_0$ after transforming the denominators in the fractions of $\alpha_1$:

$$a_0 = m-r+\frac{1}{\alpha_1} = m-r+\cfrac{P}{m+\cfrac{p(p+2q-r)(P+Q)}{2rR+\cfrac{(p+r)(p+2q)P(P+2Q)}{2r(R+S)+\cdots}}}$$

or in modern notation

$$a_0 = m-r+\frac{P}{m\,+}\ \frac{p(p+2q-r)(P+Q)}{2rR\,+}\ \frac{(p+r)(p+2q)P(P+2Q)}{2r(R+S)\,+}\ \frac{(p+2r)(p+2q+r)(P+Q)(P+3Q)}{2r(R+2S)\,+}\ \cdots. \tag{17.101}$$


To derive the continued fractions related to Wallis’s product, Euler took $p = 2q = r = 1$. In this case the ratio of the beta integrals was

$$A_0 = a_0 = \frac{2}{\pi};$$


the values of $P, Q, R, S$ were, respectively, $1+m-m^2$, $3-2m$, $\frac{5-3m}{2}$, $\frac{3-2m}{2}$. Thus,

$$a_0 = m-1+\frac{1+m-m^2}{m\,+}\ \frac{1^2(4-m-m^2)}{5-3m\,+}\ \frac{2^2(1+m-m^2)(7-3m-m^2)}{8-5m\,+}\ \frac{3^2(4-m-m^2)(10-5m-m^2)}{11-7m\,+}\ \cdots. \tag{17.102}$$


To get the continued fraction for $\frac{\pi}{2}$ in a nice form, Euler took $m = 1$ in (17.102); he skipped a step that we fill in:

$$a_0 = \frac{2}{\pi} = \cfrac{1}{1+\cfrac{1^2\cdot2}{2+\cfrac{2^2\cdot1\cdot3}{3+\cfrac{3^2\cdot2\cdot4}{4+\cdots}}}} = \cfrac{1}{1+\cfrac{1}{1+\cfrac{1\cdot2}{1+\cdots}}}.$$

Thus, Euler had $\frac{\pi}{2}$ as the reciprocal of $a_0$:

$$\frac{\pi}{2} = 1+\frac{1}{1+}\ \frac{1\times2}{1+}\ \frac{2\times3}{1+}\ \frac{3\times4}{1+}\ \cdots. \tag{17.103}$$

Thus, the continued fraction produced by Euler’s method for the product for $\frac{\pi}{2}$ was different from the continued fraction found by Brouncker.
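The step that Euler skipped is an equivalence transformation, so it can be checked exactly on finite truncations. The sketch below is an illustration rather than Euler’s own computation: it assumes the partial numerators of the $m = 1$ fraction continue as $k^2(k-1)(k+1)$ over the partial denominators $k+1$, as the first three terms suggest, and compares each truncation with the simplified form whose partial numerators are $1, 1\cdot2, 2\cdot3, \ldots$.

```python
from fractions import Fraction


def euler_m1(depth):
    """Truncation of 1/(1 + 1^2*2/(2 + 2^2*1*3/(3 + 3^2*2*4/(4 + ...))))."""
    def numerator(k):
        return 2 if k == 1 else k * k * (k - 1) * (k + 1)
    tail = Fraction(0)
    for k in range(depth, 0, -1):          # build the fraction from the inside out
        tail = Fraction(numerator(k)) / ((k + 1) + tail)
    return 1 / (1 + tail)


def simplified(depth):
    """Truncation of 1/(1 + 1/(1 + 1*2/(1 + 2*3/(1 + ...))))."""
    def numerator(k):
        return 1 if k == 1 else k * (k - 1)
    tail = Fraction(0)
    for k in range(depth, 0, -1):
        tail = Fraction(numerator(k)) / (1 + tail)
    return 1 / (1 + tail)


for depth in range(1, 6):
    print(depth, euler_m1(depth), simplified(depth))
```

Both functions return the same fraction at every depth, because dividing the numerator and denominator of the $k$th level by $k+1$ turns one form into the other.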


17.9 Sylvester: A Difference Equation and Euler’s Continued Fraction 


In 1869, J. J. Sylvester rediscovered Euler’s formula (17.103) while investigating a difference equation arising out of the successive involutes to a circle.46 The successive convergents of the continued fraction in Euler’s formula produced the partial products of Wallis’s infinite product. This infinite product was not useful for deriving approximations of $\pi$, but Sylvester showed that its continued fraction could be modified to yield good approximations. The continued fraction representation often provides better approximations than other representations, as was noted by Euler in his first paper on the topic. In his work on successive involutes, Sylvester was led to study the difference equation

$$u_{n+1}-u_{n-1} = \frac{1}{n}u_n. \tag{17.104}$$

He found two sequences as particular solutions of this equation:

$$p_1 = 1,\quad p_{2n} = p_{2n+1} = \frac{2\cdot4\cdot6\cdots2n}{1\cdot3\cdot5\cdots(2n-1)},\qquad q_{2n-1} = q_{2n} = \frac{3\cdot5\cdot7\cdots(2n-1)}{2\cdot4\cdot6\cdots(2n-2)}.$$

From Wallis’s formula for $\frac{\pi}{2}$, Sylvester concluded that

$$\lim_{n\to\infty}\frac{p_n}{q_n} = \frac{\pi}{2}. \tag{17.105}$$

From equations (3.27), (3.28), and (17.104), we see that $\frac{p_n}{q_n}$ is the $n$th convergent of a continued fraction

$$b_0+\frac{a_1}{b_1+}\ \frac{a_2}{b_2+}\ \frac{a_3}{b_3+}\ \cdots,$$

where $a_n = 1$ and $b_n = \frac{1}{n}$ for $n \ge 1$. Thus, Sylvester could write that

$$\frac{\pi}{2} = 1+\frac{1}{1+}\ \frac{1\cdot2}{1+}\ \frac{2\cdot3}{1+}\ \frac{3\cdot4}{1+}\ \cdots.$$


46 Sylvester (1869). 
To give a sense of Sylvester’s inimitable style, we quote the sentence immediately following the last continued fraction, from Sylvester’s paper: “This is obviously the simplest form of continued fraction for $\pi$ that can be given, and yet, strange to say, has not, I believe, before been observed. Truly wonders never cease!” Though Sylvester’s result was of course not new, his method was original, and he also explained that this continued fraction could be used to improve the approximation obtained from Wallis’s product or, equivalently, from the Madhava-Leibniz series.

Note that if $u_n$ is the remainder after $n$ terms of the fraction, then

$$u_n = \frac{n(n+1)}{1+}\ \frac{(n+1)(n+2)}{1+}\ \cdots \tag{17.106}$$

and

$$u_n = \frac{n^2+n}{1+u_{n+1}}. \tag{17.107}$$

This shows that $u_n$ is unbounded as $n \to \infty$, and hence $u_nu_{n+1} \approx n^2+n$ and $u_n \approx n$ for large $n$. Thus, for large $n$ we may write, following Sylvester,

$$\frac{\pi}{2} \approx 1+\frac{1}{1+}\ \frac{2}{1+}\ \frac{6}{1+}\ \cdots\ \frac{n(n-1)}{1+n}. \tag{17.108}$$

This correction by $n$ at the end of the formula improves the $n$th approximant obtained from the continued fraction. Thus, Sylvester noted that the convergents for $n = 4$ and $n = 5$ were

$$\frac{64}{45} = 1.4222\quad\text{and}\quad\frac{384}{225} = 1.7067,$$

while the corrected values given by (17.108) were

$$\frac{128}{81} = 1.5802\quad\text{and}\quad\frac{352}{225} = 1.5644.$$

For comparison, note that the actual value of $\frac{\pi}{2}$ to four decimal places is 1.5708; thus, the continued fraction has an advantage over the Wallis product.
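Sylvester’s four numerical values can be reproduced with exact rational arithmetic. The sketch below evaluates the truncated fraction with last partial denominator 1 (the plain convergent) or $1+n$ (the corrected value of (17.108)); the function name and the `corrected` flag are illustrative choices, and $n \ge 2$ is assumed.

```python
from fractions import Fraction
import math


def sylvester(n, corrected=False):
    """Evaluate 1 + 1/(1 + 2/(1 + 6/(1 + ... + n(n-1)/d))), d = 1 or 1 + n."""
    last_denominator = 1 + n if corrected else 1
    tail = Fraction(n * (n - 1), last_denominator)
    for k in range(n - 1, 1, -1):          # partial numerators k(k-1), k = n-1, ..., 2
        tail = Fraction(k * (k - 1)) / (1 + tail)
    return 1 + 1 / (1 + tail)


for n in (4, 5):
    plain, fixed = sylvester(n), sylvester(n, corrected=True)
    print(n, plain, float(plain), fixed, float(fixed))
print(math.pi / 2)
```

The plain convergents are partial products of Wallis’s product and oscillate widely around $\frac{\pi}{2}$, while the corrected values already agree with it to about two decimal places.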


17.10 Poisson, Jacobi, and Dirichlet: Beta Integrals 


The early nineteenth-century mathematicians, striving to better understand how to 
manipulate definite integrals, used them to give new proofs of already known 
properties of the gamma and beta integrals. Poisson, Jacobi, and Dirichlet showed 
how double integrals could be employed to evaluate Euler’s beta integral (17.10). 
It is interesting to see that Poisson made a change of variables in the double 
integral one variable at a time, while Jacobi changed from one pair of variables to 
another. Poisson’s derivation appeared in papers of 1811 and 1823;47 Jacobi’s proof 


47 Poisson (1823); especially see pp. 477-478. 
was published in 1833.48 In 1841, Jacobi wrote his important paper on functional determinants or Jacobians49 (so called by J. J. Sylvester50) from which arose the change of variables formula for $n$-dimensional integrals. Around 1836, the Russian mathematician M. Ostrogradsky gave a derivation of the general change of variables formula based on symbolic manipulation.

In Poisson’s evaluation of the beta integral, we first observe that the substitution $t = \frac{1}{1+s}$ gives

$$\int_0^1t^{p-1}(1-t)^{q-1}\,dt = \int_0^\infty\frac{s^{q-1}}{(1+s)^{p+q}}\,ds. \tag{17.109}$$


Euler knew this, but Poisson also noted that the integrals converged if $p$ and $q$ were real and positive, or if $p$ and $q$ were complex with positive real parts. Poisson started by multiplying the integrals for $\Gamma(p)$ and $\Gamma(q)$ to get

$$\Gamma(p)\Gamma(q) = \int_0^\infty\int_0^\infty e^{-x-y}x^{p-1}y^{q-1}\,dx\,dy. \tag{17.110}$$

He then substituted $xy$ and $x\,dy$ for $y$ and $dy$, followed by $\frac{x}{1+y}$ and $\frac{dx}{1+y}$ in place of $x$ and $dx$, to obtain

$$\Gamma(p)\Gamma(q) = \int_0^\infty\int_0^\infty e^{-x(1+y)}x^{p+q-1}y^{q-1}\,dx\,dy = \int_0^\infty\int_0^\infty\frac{e^{-x}x^{p+q-1}y^{q-1}}{(1+y)^{p+q}}\,dx\,dy = \Gamma(p+q)\int_0^\infty\frac{y^{q-1}\,dy}{(1+y)^{p+q}}.$$

This proved (17.10). In 1833 Jacobi gave a different substitution. He set $x+y = r$, $x = rw$, so that $r$ ranged from 0 to $\infty$ and $w$ from 0 to 1. He then noted that

$$dx\,dy = r\,dr\,dw, \tag{17.111}$$

so that the change of variables from $x, y$ to $r, w$ in (17.110) gave

$$\Gamma(p)\Gamma(q) = \int_0^\infty e^{-r}r^{p+q-1}\,dr\int_0^1w^{p-1}(1-w)^{q-1}\,dw.$$

He did not explain how he obtained (17.111), probably because he took it to be well known.
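Both ingredients of Jacobi’s argument can be checked numerically. The sketch below is an illustration, not a reconstruction of Jacobi’s reasoning: it estimates the Jacobian determinant of $(r, w) \mapsto (x, y) = (rw, r(1-w))$ by central differences, and compares a Simpson’s-rule value of the beta integral with $\Gamma(p)\Gamma(q)/\Gamma(p+q)$. The sample values $p = 2.5$, $q = 3.5$ and the step counts are arbitrary choices.

```python
import math


def jacobian_det(r, w, h=1e-6):
    """Central-difference Jacobian determinant of (r, w) -> (r*w, r*(1 - w))."""
    def xy(r_, w_):
        return r_ * w_, r_ * (1 - w_)
    d_r = [(a - b) / (2 * h) for a, b in zip(xy(r + h, w), xy(r - h, w))]  # (dx/dr, dy/dr)
    d_w = [(a - b) / (2 * h) for a, b in zip(xy(r, w + h), xy(r, w - h))]  # (dx/dw, dy/dw)
    return d_r[0] * d_w[1] - d_w[0] * d_r[1]


def beta_simpson(p, q, n=2000):
    """Composite Simpson approximation of the beta integral on [0, 1] (n even, p, q >= 1)."""
    f = lambda t: t ** (p - 1) * (1 - t) ** (q - 1)
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


print(abs(jacobian_det(2.0, 0.3)))          # magnitude should be close to r = 2
print(beta_simpson(2.5, 3.5))
print(math.gamma(2.5) * math.gamma(3.5) / math.gamma(6.0))
```

The determinant itself is $-r$; only its absolute value enters (17.111).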

We note that Poisson gave the conditions for convergence of the integrals, reflecting 
the increasing awareness among mathematicians that rigor was important. In fact, 
the works of Gauss, Cauchy, and Abel on infinite series contain the first significant 
expressions of this rigor. And Dirichlet was particularly attentive to questions of rigor 


48 Jacobi (1833). 
49 Jacobi (1841). 
50 Sylvester (1853a), especially Art. 65. 
in his important work on integrals, both in papers and in lectures. For example, he discussed the conditions, such as absolute convergence, for changing the order of integration in a double integral.

Dirichlet’s evaluation of Euler’s integral and his proof of Gauss’s multiplication formula using integrals are both given in the exercises at the end of this chapter. Here we mention a multivariable extension of Euler’s integral presented by Dirichlet to the Berlin Academy in 1839:51

$$\int\cdots\int x_1^{\alpha_1-1}x_2^{\alpha_2-1}\cdots x_n^{\alpha_n-1}\,dx_1\cdots dx_n = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)\cdots\Gamma(\alpha_n)}{\Gamma(1+\alpha_1+\cdots+\alpha_n)}, \tag{17.112}$$

where $\alpha_i > 0$ and the integral is taken over the region $\sum_{i=1}^nx_i \le 1$, $x_i \ge 0$, $i = 1, 2, \ldots, n$. Note that this formula is an iterated form of Euler’s beta integral; in 1941, Selberg discovered a genuine multidimensional generalization of Euler’s beta integral.52 Also note that the use of Euler’s beta integral yields a new proof of Euler’s reflection formula. For this purpose, take $q = 1-p$ in the infinite integral (17.109). Then, since the left-hand side is symmetric in $p$ and $1-p$, we have

$$\Gamma(p)\Gamma(1-p) = \int_0^\infty\frac{s^{p-1}}{1+s}\,ds.$$

So if we verify that the value of the integral is $\frac{\pi}{\sin p\pi}$, we have our proof. Euler himself evaluated this integral in 1738 for $p$ a rational number. See Section 12.6 in this connection. Dedekind included an improved and streamlined version of this method in his doctoral thesis of 1852.53 He let

$$B = \int_0^\infty\frac{x^{\frac{m}{n}-1}}{x+1}\,dx = n\int_0^\infty\frac{x^{m-1}}{x^n+1}\,dx. \tag{17.113}$$


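Then

$$x^n+1 = (x-\zeta)(x-\zeta^3)\cdots(x-\zeta^{2n-1}),$$

where $\zeta = e^{\pi i/n}$. By a partial fractions expansion, and applying (12.54), he obtained

$$\frac{x^{m-1}}{x^n+1} = -\frac{1}{n}\sum_{k=1}^n\frac{\zeta^{(2k-1)m}}{x-\zeta^{2k-1}}.$$

From this, Dedekind could conclude

$$n\int\frac{x^{m-1}}{x^n+1}\,dx = -\sum_{k=1}^n\zeta^{(2k-1)m}\log(\zeta^{2k-1}-x).$$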
The last expression was easy to evaluate at $x = 0$ but not at $x = \infty$. So Dedekind rewrote this expression as


51 Dirichlet (1839b). 
52 Selberg (1989) vol. 1, pp. 204-213 and vol. 1, pp. 62-73. 
53 Dedekind (1930) vol. I, pp. 1-26. 
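$$-\sum_{k=1}^n\zeta^{(2k-1)m}\log\left(\frac{\zeta^{2k-1}}{x}-1\right)-\log(x)\cdot\sum_{k=1}^n\zeta^{(2k-1)m}, \tag{17.114}$$

the last being a sum that was zero because

$$\sum_{k=1}^n\zeta^{(2k-1)m} = \zeta^m\sum_{k=1}^n\zeta^{2m(k-1)} = \zeta^m\,\frac{1-\zeta^{2mn}}{1-\zeta^{2m}} = 0.$$

Thus,

$$n\int\frac{x^{m-1}}{x^n+1}\,dx = -\sum_{k=1}^n\zeta^{(2k-1)m}\log\left(\frac{\zeta^{2k-1}}{x}-1\right). \tag{17.115}$$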


He next used this relation at $x = \infty$ and the previous one at $x = 0$ to get

$$B = \frac{\pi i}{n}\sum_{k=1}^n(2k-1)\zeta^{(2k-1)m} = \frac{\pi}{\sin\frac{m\pi}{n}}.$$


Dedekind remarked that his use of the second relation to evaluate the integral at $\infty$ made his proof shorter than the ones found in integral calculus textbooks. It
is interesting to compare Dedekind’s derivation with that of Euler, (12.68). Dedekind 
gave three other evaluations of this integral. One of these used Cauchy’s new calculus 
of residues. Another proof, included in the exercises, employed differential equations, 
and was an original contribution of Dedekind. In the third proof, given earlier by Euler, 
Dedekind expressed the integral as a partial fractions expansion. 


17.11 The Volume of an n-Dimensional Ball 


Dirichlet applied the integral formula (17.112) to find the volume of a unit ball in $n$ dimensions. However, Dirichlet’s proof and his evaluations of other integrals involving surface areas in $n$ dimensions omitted some details. Dirichlet published his 1839 paper in Germany on multiple integrals;54 he placed an even briefer proof in Liouville’s French journal.55 Liouville responded the same year with a paper in his journal,56 providing the necessary details; we follow Dirichlet’s argument, with details as presented by Liouville.

Taking $n = 1$ in (17.112) gives

$$\int_0^1x_1^{\alpha_1-1}\,dx_1 = \frac{1}{\alpha_1} = \frac{\Gamma(\alpha_1)}{\Gamma(1+\alpha_1)}.$$

Assume the result true for $n = k$. For $\alpha_i > 0$ and $\sum_{i=1}^kx_i \le 1$, $x_i \ge 0$, $i = 1, 2, \ldots, k$, we then have the integral on this region


54 Dirichlet (1839b). 
55. Dirichlet (1839a). 
56 Liouville (1839). 
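$$\int\cdots\int x_1^{\alpha_1-1}x_2^{\alpha_2-1}\cdots x_k^{\alpha_k-1}\,dx_1\cdots dx_k = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)\cdots\Gamma(\alpha_k)}{\Gamma(1+\alpha_1+\cdots+\alpha_k)}. \tag{17.116}$$

Now consider the $(k+1)$-dimensional integral

$$\int\cdots\int x_1^{\alpha_1-1}x_2^{\alpha_2-1}\cdots x_{k+1}^{\alpha_{k+1}-1}\,dx_1\cdots dx_{k+1}$$

$$= \int_0^1\int_0^{1-x_1}\cdots\int_0^{1-x_1-x_2-\cdots-x_k}x_1^{\alpha_1-1}x_2^{\alpha_2-1}\cdots x_{k+1}^{\alpha_{k+1}-1}\,dx_{k+1}\,dx_k\cdots dx_1$$

$$= \frac{1}{\alpha_{k+1}}\int_0^1\int_0^{1-x_1}\cdots\int_0^{1-x_1-\cdots-x_{k-1}}x_1^{\alpha_1-1}\cdots x_k^{\alpha_k-1}(1-x_1-\cdots-x_k)^{\alpha_{k+1}}\,dx_k\,dx_{k-1}\cdots dx_1 \tag{17.117}$$

and set $x_k = t(1-x_1-\cdots-x_{k-1})$ to find that (17.117) equals

$$\frac{1}{\alpha_{k+1}}\int_0^1\int_0^{1-x_1}\cdots\int_0^{1-x_1-\cdots-x_{k-2}}x_1^{\alpha_1-1}\cdots x_{k-1}^{\alpha_{k-1}-1}(1-x_1-\cdots-x_{k-1})^{\alpha_k+\alpha_{k+1}}\int_0^1t^{\alpha_k-1}(1-t)^{\alpha_{k+1}}\,dt\,dx_{k-1}\cdots dx_1$$

$$= \frac{1}{\alpha_{k+1}}\,\frac{\Gamma(\alpha_k)\Gamma(\alpha_{k+1}+1)}{\Gamma(\alpha_k+\alpha_{k+1}+1)}\int_0^1\cdots\int_0^{1-x_1-\cdots-x_{k-2}}x_1^{\alpha_1-1}\cdots x_{k-1}^{\alpha_{k-1}-1}(1-x_1-\cdots-x_{k-1})^{\alpha_k+\alpha_{k+1}}\,dx_{k-1}\cdots dx_1. \tag{17.118}$$

Observe that integrating (17.116) once with respect to $x_k$ produces

$$\int_0^1\cdots\int_0^{1-x_1-\cdots-x_{k-2}}x_1^{\alpha_1-1}\cdots x_{k-1}^{\alpha_{k-1}-1}(1-x_1-\cdots-x_{k-1})^{\alpha_k}\,dx_{k-1}\cdots dx_1 = \frac{\alpha_k\,\Gamma(\alpha_1)\cdots\Gamma(\alpha_k)}{\Gamma(1+\alpha_1+\cdots+\alpha_k)}, \tag{17.119}$$

and from this we conclude that the value of the last integral in (17.118) must be

$$\frac{(\alpha_k+\alpha_{k+1})\,\Gamma(\alpha_1)\cdots\Gamma(\alpha_{k-1})\Gamma(\alpha_k+\alpha_{k+1})}{\Gamma(1+\alpha_1+\cdots+\alpha_{k+1})}. \tag{17.120}$$

Substituting (17.120) in (17.118) finally gives us

$$\frac{1}{\alpha_{k+1}}\,\frac{\Gamma(\alpha_k)\Gamma(\alpha_{k+1}+1)}{\Gamma(\alpha_k+\alpha_{k+1}+1)}\cdot\frac{(\alpha_k+\alpha_{k+1})\,\Gamma(\alpha_1)\cdots\Gamma(\alpha_{k-1})\Gamma(\alpha_k+\alpha_{k+1})}{\Gamma(1+\alpha_1+\cdots+\alpha_{k+1})} = \frac{\Gamma(\alpha_1)\cdots\Gamma(\alpha_{k+1})}{\Gamma(1+\alpha_1+\cdots+\alpha_{k+1})},$$

completing our proof of (17.112).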
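Note that the region of integration for the integral in (17.112) is the region enclosed by $x_i \ge 0$ and $\sum_{i=1}^nx_i \le 1$. By applying a change of variables, following Dirichlet and Liouville, we can evaluate the integral:

$$\int\cdots\int_Vx_1^{\alpha_1-1}x_2^{\alpha_2-1}\cdots x_n^{\alpha_n-1}\,dx_1\cdots dx_n = \frac{a_1^{\alpha_1}a_2^{\alpha_2}\cdots a_n^{\alpha_n}}{p_1p_2\cdots p_n}\,\frac{\Gamma\left(\frac{\alpha_1}{p_1}\right)\Gamma\left(\frac{\alpha_2}{p_2}\right)\cdots\Gamma\left(\frac{\alpha_n}{p_n}\right)}{\Gamma\left(1+\frac{\alpha_1}{p_1}+\cdots+\frac{\alpha_n}{p_n}\right)}, \tag{17.121}$$

with conditions on $V$ set as

$$x_i \ge 0,\quad a_i > 0,\quad p_i > 0,\quad i = 1, \ldots, n,\quad\sum_{i=1}^n\left(\frac{x_i}{a_i}\right)^{p_i} \le 1.$$

To reduce the integral in (17.121) to (17.112), let $y_i = \left(\frac{x_i}{a_i}\right)^{p_i}$, so that

$$\frac{dy_i}{y_i} = p_i\,\frac{dx_i}{x_i}.$$

Substitute in (17.121) to get

$$\frac{a_1^{\alpha_1}a_2^{\alpha_2}\cdots a_n^{\alpha_n}}{p_1p_2\cdots p_n}\int\cdots\int_Vy_1^{\frac{\alpha_1}{p_1}-1}y_2^{\frac{\alpha_2}{p_2}-1}\cdots y_n^{\frac{\alpha_n}{p_n}-1}\,dy_1\cdots dy_n,$$

where $V$ is now defined by $y_i \ge 0$ and $\sum_{i=1}^ny_i \le 1$; by (17.112), (17.121) is now verified. Thus, to find the volume $V$ of the region enclosed by

$$x_i \ge 0,\quad\sum_{i=1}^n\left(\frac{x_i}{a_i}\right)^{p_i} \le 1,$$

take $\alpha_i = 1$, $i = 1, 2, \ldots, n$ in (17.121) to get

$$V = \frac{\prod_{i=1}^n\frac{a_i}{p_i}\,\Gamma\left(\frac{1}{p_i}\right)}{\Gamma\left(1+\sum_{i=1}^n\frac{1}{p_i}\right)}. \tag{17.122}$$

Formula (17.122) implies that the volume of the $n$-dimensional ellipsoid $\sum_{i=1}^n\left(\frac{x_i}{a_i}\right)^2 \le 1$, which consists of $2^n$ congruent pieces, one in each orthant, must be

$$\frac{\pi^{n/2}\,a_1a_2\cdots a_n}{\Gamma\left(1+\frac{n}{2}\right)} \tag{17.123}$$

and the volume of an $n$-dimensional ball of radius $r$ is

$$\frac{\pi^{n/2}\,r^n}{\Gamma\left(1+\frac{n}{2}\right)}. \tag{17.124}$$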
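Formulas (17.122) and (17.124) are easy to evaluate numerically. The sketch below uses only the standard `math` module; the function names are illustrative, and the factor $2^n$ accounts for the orthants, since (17.122) covers only the part of the region with all $x_i \ge 0$.

```python
import math


def dirichlet_volume(a, p):
    """Volume of {x_i >= 0, sum (x_i / a_i)**p_i <= 1}, following (17.122)."""
    numerator = math.prod(a_i / p_i * math.gamma(1 / p_i) for a_i, p_i in zip(a, p))
    return numerator / math.gamma(1 + sum(1 / p_i for p_i in p))


def ball_volume(n, r=1.0):
    """Volume of an n-dimensional ball of radius r, following (17.124)."""
    return math.pi ** (n / 2) * r ** n / math.gamma(1 + n / 2)


print(ball_volume(2), math.pi)                  # area of the unit disc
print(ball_volume(3), 4 * math.pi / 3)          # volume of the unit ball
print(2 ** 3 * dirichlet_volume([2.0] * 3, [2.0] * 3), ball_volume(3, 2.0))
print(dirichlet_volume([1.0, 1.0], [1.0, 1.0]))  # triangle x, y >= 0, x + y <= 1
```

Taking all $p_i = 1$ and $a_i = 1$ recovers the volume $\frac{1}{n!}$ of the standard simplex, the case $\alpha_i = 1$ of (17.112).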
Dirichlet and Liouville also computed the integral over the surface $S$, defined by $x_i \ge 0$, $\sum_{i=1}^nx_i = 1$:

$$\int\cdots\int_Sx_1^{\alpha_1-1}x_2^{\alpha_2-1}\cdots x_n^{\alpha_n-1}\,dx_1\cdots dx_{n-1} = \frac{\Gamma(\alpha_1)\cdots\Gamma(\alpha_n)}{\Gamma(\alpha_1+\cdots+\alpha_n)}. \tag{17.125}$$

To sketch a proof, observe that $S$ is defined by $x_n = 1-x_1-x_2-\cdots-x_{n-1}$, where $x_i \ge 0$, $i = 1, 2, \ldots, n$, and $x_1+x_2+\cdots+x_{n-1} \le 1$. Thus, the integral in (17.125) can be written as

$$\int_0^1\int_0^{1-x_1}\cdots\int_0^{1-x_1-\cdots-x_{n-2}}x_1^{\alpha_1-1}\cdots x_{n-1}^{\alpha_{n-1}-1}(1-x_1-\cdots-x_{n-1})^{\alpha_n-1}\,dx_{n-1}\cdots dx_1,$$

and by applying (17.119), we get our result.


17.12 The Selberg Integral 


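We have seen that Dirichlet’s integral (17.112) is a generalization of Euler’s beta integral. In 1941, Selberg presented another generalization:57

If $n$ is a positive integer; $\alpha, \beta, \gamma$ are complex numbers such that

$$\operatorname{Re}\alpha > 0,\quad\operatorname{Re}\beta > 0$$

and

$$\operatorname{Re}\gamma > -\min\left\{\frac{1}{n},\ \frac{\operatorname{Re}\alpha}{n-1},\ \frac{\operatorname{Re}\beta}{n-1}\right\},$$

and if

$$\Delta(x) = \prod_{1\le i<j\le n}(x_i-x_j), \tag{17.126}$$

then

$$S_n(\alpha,\beta,\gamma) = \int_0^1\cdots\int_0^1\prod_{i=1}^nx_i^{\alpha-1}(1-x_i)^{\beta-1}\,|\Delta(x)|^{2\gamma}\,dx_1\cdots dx_n = \prod_{j=1}^n\frac{\Gamma(\alpha+(j-1)\gamma)\,\Gamma(\beta+(j-1)\gamma)\,\Gamma(1+j\gamma)}{\Gamma(\alpha+\beta+(n+j-2)\gamma)\,\Gamma(1+\gamma)}. \tag{17.127}$$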
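As a sanity check on (17.127), the right-hand side can be evaluated for small cases in which the integral is elementary. The sketch below is restricted to real parameters for which every gamma-function argument is positive; the function name is an illustrative choice. For $n = 1$ the product reduces to Euler’s beta integral, and for $n = 2$, $\gamma = 1$ the integrand is a polynomial: $\int_0^1\int_0^1(x-y)^2\,dx\,dy = \frac{1}{6}$ and $\int_0^1\int_0^1xy(x-y)^2\,dx\,dy = \frac{1}{36}$.

```python
import math


def selberg(n, alpha, beta, gamma):
    """Right-hand side of (17.127) for real parameters with positive gamma-function arguments."""
    value = 1.0
    for j in range(1, n + 1):
        value *= (math.gamma(alpha + (j - 1) * gamma)
                  * math.gamma(beta + (j - 1) * gamma)
                  * math.gamma(1 + j * gamma)
                  / (math.gamma(alpha + beta + (n + j - 2) * gamma)
                     * math.gamma(1 + gamma)))
    return value


# n = 1: Euler's beta integral B(2.5, 3.5), independent of gamma
print(selberg(1, 2.5, 3.5, 0.7), math.gamma(2.5) * math.gamma(3.5) / math.gamma(6.0))
# n = 2, gamma = 1: polynomial integrands
print(selberg(2, 1, 1, 1), 1 / 6)
print(selberg(2, 2, 1, 1), 1 / 36)
```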


In 1941, Selberg needed the integral formula (17.127) in a paper on entire functions,58 in which he wrote that if the formula were new, he would publish


57 Selberg (1941). 
58 Selberg (1944). 
a proof elsewhere. And he published the proof, but in a popular mathematics magazine,59 concerning which he later remarked:


This paper was published with some hesitation, and in Norwegian, since I was rather doubtful that 
the results were new. The journal is one which is read by mathematics-teachers in the gymnasium, 
and the proof was written out in detail so it should be understandable to someone who knew a 
little bit about analytic functions and analytic continuation. 


For his generalization, Selberg required a weak form of Carlson’s theorem, 
mentioned at the end of Section 17.4. Using only Cauchy’s integral formula, he proved 
that if f(z) was analytic and bounded for Re z > 0 and if f(z) = 0 for z = 0,1,2,..., 
then f(z) must be identically zero. We are fortunate that Selberg provided this nice 
proof for his audience of mathematics teachers; if he had been writing for mathematics 
researchers, he most probably would have simply cited Carlson’s theorem. We here 
give Selberg’s proof and later we present Askey’s proof of Selberg’s integral formula, which does not use Carlson’s theorem. We mention that G. W. Anderson in 1991 gave a proof of Selberg’s integral formula using Dirichlet’s integral (17.112) but not Carlson’s theorem.61

For Selberg’s proof of the weak form of Carlson’s theorem,62 note that it is a consequence of Cauchy’s integral formula, or Cauchy’s residue theorem, that

$$f(a) = \frac{(a-1)(a-2)\cdots(a-n)}{2\pi i}\int_{-i\infty}^{i\infty}\frac{f(z)\,dz}{(a-z)(z-1)\cdots(z-n)} \tag{17.128}$$

for $n > a > 0$. For $a > 1$, (17.128) implies that

$$|f(a)| \le \frac{[a]!\,(n-[a])!}{2\pi}\int_{-\infty}^{\infty}\frac{|f(it)|\,dt}{\sqrt{(a^2+t^2)(1+t^2)\cdots(n^2+t^2)}} \le \frac{[a]!\,(n-[a])!}{2\pi\,n!}\int_{-\infty}^{\infty}\frac{|f(it)|}{1+t^2}\,dt. \tag{17.129}$$

The integral converges because $f(z)$ is bounded, and because $a > 1$:

$$\frac{[a]!\,(n-[a])!}{n!} \to 0\quad\text{as } n \to \infty.$$

Hence, by (17.129), $f(a) = 0$ for $a > 1$ and, by analytic continuation, $f(a) = 0$ for $a > 0$.

Most mathematicians and physicists were hardly aware of Selberg’s integral for 
at least thirty years. Some researchers in entire functions probably knew it, because 
Selberg included it in his 1941 paper on Gelfond’s theorem, and this paper was 
referenced in Boas’s well-known work on entire functions. In the early 1960s, 
physicists F. J. Dyson and M. L. Mehta conjectured a limiting case of Selberg’s 


59 Selberg (1944). 

60 Selberg (1982) vol. I, p. 212. 
61 Anderson (1991). 

62 Selberg (1944). 
integral in their book on the statistical theory of energy levels of complex systems.63
Then in 1976, when he, Selberg, and Dyson were all at the Institute for Advanced
Study at Princeton, Enrico Bombieri was led to consider integrals of the Selberg type 
in the course of his work on prime number theory. Perceiving his integrals to be 
similar to those used in physics, Bombieri went to consult Dyson, who referred him 
to Mehta’s book, Random Matrices. Bombieri next approached Selberg to discuss his 
problem on the distribution of primes, and Selberg recognized Bombieri’s integral as 
a more complex version of his generalized beta integral. 

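Selberg’s original proof was complicated. An idea for a simplified proof was found by K. Aomoto in 1987.65 In fact, Aomoto’s method evaluated a slightly more general integral. Thus, following Aomoto, denote the integrand in Selberg’s integral by

$$\Phi(x) = \Phi(x;\alpha,\beta,\gamma) = \prod_{i=1}^nx_i^{\alpha-1}(1-x_i)^{\beta-1}\prod_{1\le i<j\le n}|x_i-x_j|^{2\gamma}$$

and set

$$I_k = \int_{C_n}\prod_{j=1}^kx_j\,\Phi(x;\alpha,\beta,\gamma)\,dx,$$

where $C_n$ is the $n$-dimensional unit cube and $dx = dx_1dx_2\cdots dx_n$. Aomoto found a relation between $I_k$ and $I_{k-1}$ by observing that, since

$$\frac{d}{dx}\left(x^a(1-x)^b\right) = \left(\frac{a}{x}-\frac{b}{1-x}\right)x^a(1-x)^b,$$

we can write

$$0 = \int_{C_n}\frac{\partial}{\partial x_1}\left((1-x_1)\prod_{i=1}^kx_i\,\Phi(x)\right)dx$$

$$= \alpha\int_{C_n}(1-x_1)\prod_{j=2}^kx_j\,\Phi(x)\,dx-\beta\int_{C_n}\prod_{j=1}^kx_j\,\Phi(x)\,dx+2\gamma\sum_{j=2}^n\int_{C_n}\frac{1-x_1}{x_1-x_j}\prod_{i=1}^kx_i\,\Phi(x)\,dx. \tag{17.130}$$

Aomoto proved two lemmas that revealed how the third integral in (17.130) can be written in terms of $I_k$ and $I_{k-1}$. Lemma 1 states,

$$\int_{C_n}\frac{\prod_{i=1}^kx_i}{x_1-x_j}\,\Phi(x)\,dx = \begin{cases}0, & \text{if } 2 \le j \le k,\\[2pt] \frac{1}{2}I_{k-1}, & \text{if } k < j \le n.\end{cases}$$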


63 Dyson and Mehta (1963).. 
64 For more history of the Selberg integral, see Forrester and Warnaar (2008). 
65 Aomoto (1987). 
In the case $2 \le j \le k$, the transposition $x_1 \leftrightarrow x_j$ changes $x_1-x_j$ to $x_j-x_1$ and $\prod_{i=1}^kx_i$ remains unchanged. Hence the value of the integral in this case is zero. On the other hand, when $k < j \le n$, the same transposition effects the change

$$\frac{x_1}{x_1-x_j} \to \frac{x_j}{x_j-x_1} = -\frac{x_j}{x_1-x_j},$$

so that

$$2\int_{C_n}\frac{\prod_{i=1}^kx_i}{x_1-x_j}\,\Phi(x)\,dx = \int_{C_n}\prod_{i=2}^kx_i\,\Phi(x)\,dx = I_{k-1}.$$

Lemma 2 states,

$$\int_{C_n}\frac{x_1\prod_{i=1}^kx_i}{x_1-x_j}\,\Phi(x)\,dx = \begin{cases}\frac{1}{2}I_k, & \text{if } 2 \le j \le k,\\[2pt] I_k, & \text{if } k < j \le n.\end{cases}$$

When $2 \le j \le k$, the transposition $x_1 \leftrightarrow x_j$ produces

$$\frac{x_1^2x_j}{x_1-x_j} \to \frac{x_1x_j^2}{x_j-x_1} = x_1x_j-\frac{x_1^2x_j}{x_1-x_j},$$

proving the first part. For the second part, note that

$$\frac{x_1^2}{x_1-x_j} = x_1+\frac{x_1x_j}{x_1-x_j}.$$

Note that the last term changes sign in the transposition $x_1 \leftrightarrow x_j$, so this term leads to the value zero for the integral. The first term, $x_1$, produces $I_k$, proving the second lemma.

Applying these two lemmas to (17.130) gave Aomoto the relation

$$0 = \alpha I_{k-1}-(\alpha+\beta)I_k+\gamma(n-k)I_{k-1}-\gamma(2n-k-1)I_k,$$

or

$$I_k = \frac{\alpha+(n-k)\gamma}{\alpha+\beta+(2n-k-1)\gamma}\,I_{k-1}. \tag{17.131}$$
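Aomoto’s recurrence (17.131) can be tested exactly in the smallest nontrivial case. The sketch below takes $n = 2$ and $\gamma = 1$ with positive integers $\alpha, \beta$ (choices made only so that everything is a rational number): it expands $(x_1-x_2)^2$, integrates term by term with Euler’s beta integral, and compares $I_k/I_{k-1}$ with the predicted ratio. The helper names are illustrative.

```python
from fractions import Fraction
from math import factorial


def beta_int(a, b):
    """Euler's beta integral B(a, b) for positive integers a, b, as an exact fraction."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))


def aomoto_I(k, alpha, beta):
    """I_k for n = 2, gamma = 1, obtained by expanding (x1 - x2)^2 term by term."""
    e1 = 1 if k >= 1 else 0      # extra power of x1 coming from the factor x1...xk
    e2 = 1 if k >= 2 else 0      # extra power of x2
    M = lambda s: beta_int(alpha + s, beta)   # moment: integral of x^(alpha-1+s) (1-x)^(beta-1)
    return M(e1 + 2) * M(e2) - 2 * M(e1 + 1) * M(e2 + 1) + M(e1) * M(e2 + 2)


def predicted_ratio(k, alpha, beta, n=2, gamma=1):
    """The factor in (17.131)."""
    return Fraction(alpha + (n - k) * gamma, alpha + beta + (2 * n - k - 1) * gamma)


for k in (1, 2):
    print(k, aomoto_I(k, 2, 3) / aomoto_I(k - 1, 2, 3), predicted_ratio(k, 2, 3))
```

Here $I_0$ is Selberg’s integral $S_2(\alpha,\beta,1)$ itself.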


Iteration of (17.131) $k$ times produced

$$I_k = \prod_{i=1}^k\frac{\alpha+(n-i)\gamma}{\alpha+\beta+(2n-i-1)\gamma}\int_{C_n}\Phi(x)\,dx. \tag{17.132}$$

Note that the last integral is Selberg’s integral, $S_n(\alpha,\beta,\gamma)$. Taking $k = n$ in (17.132) produced




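$$S_n(\alpha+1,\beta,\gamma) = \prod_{j=1}^n\frac{\alpha+(n-j)\gamma}{\alpha+\beta+(2n-j-1)\gamma}\,S_n(\alpha,\beta,\gamma) \tag{17.133}$$

$$= \prod_{j=1}^n\frac{\alpha+(j-1)\gamma}{\alpha+\beta+(n+j-2)\gamma}\,S_n(\alpha,\beta,\gamma).$$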


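This implied that when $\alpha$ was a positive integer

$$S_n(\alpha,\beta,\gamma) = \prod_{j=1}^n\prod_{m=1}^{\alpha-1}\frac{m+(j-1)\gamma}{m+\beta+(n+j-2)\gamma}\,S_n(1,\beta,\gamma). \tag{17.134}$$

With $\beta$ also a positive integer, symmetry in $\alpha$ and $\beta$ gave

$$S_n(1,\beta,\gamma) = \prod_{j=1}^n\prod_{m=1}^{\beta-1}\frac{m+(j-1)\gamma}{m+1+(n+j-2)\gamma}\,S_n(1,1,\gamma). \tag{17.135}$$

Observe that

$$\prod_{m=1}^{\alpha-1}\frac{m+(j-1)\gamma}{m+\beta+(n+j-2)\gamma}\prod_{m=1}^{\beta-1}\frac{m+(j-1)\gamma}{m+1+(n+j-2)\gamma} = \frac{\Gamma(\alpha+(j-1)\gamma)\,\Gamma(\beta+(j-1)\gamma)\,\Gamma(2+(n+j-2)\gamma)}{\Gamma(\alpha+\beta+(n+j-2)\gamma)\left(\Gamma(1+(j-1)\gamma)\right)^2}. \tag{17.136}$$

Combining (17.134), (17.135), and (17.136), Aomoto obtained

$$S_n(\alpha,\beta,\gamma) = D_n(\gamma)\prod_{j=1}^n\frac{\Gamma(\alpha+(j-1)\gamma)\,\Gamma(\beta+(j-1)\gamma)}{\Gamma(\alpha+\beta+(n+j-2)\gamma)}, \tag{17.137}$$

where $D_n(\gamma)$ depends only on $\gamma$ and $n$, and not on $\alpha$ and $\beta$.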

Formula (17.137) was derived earlier by Selberg, using a different method. From this point on, Aomoto’s proof followed Selberg’s. To determine the value of $D_n(\gamma)$, Selberg took $\gamma$ to be a positive integer. Observe that with $\alpha = \beta = 1$, (17.137) becomes

$$S_n(1,1,\gamma) = \int_{C_n}|\Delta(x)|^{2\gamma}\,dx = D_n(\gamma)\prod_{j=1}^n\frac{\Gamma(1+(j-1)\gamma)\,\Gamma(1+(j-1)\gamma)}{\Gamma(2+(n+j-2)\gamma)}. \tag{17.138}$$


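Now, following Selberg, take $x_n$ to be the largest of the variables $x_1, x_2, \ldots, x_n$; since any of the $n$ variables could be the largest, it follows that

$$S_n(1,1,\gamma) = n\int_0^1\left(\int_0^{x_n}\cdots\int_0^{x_n}\bigl((x_n-x_1)\cdots(x_n-x_{n-1})\bigr)^{2\gamma}\,|\Delta(x_1,\ldots,x_{n-1})|^{2\gamma}\,dx_1\cdots dx_{n-1}\right)dx_n.$$

Make the change of variables $x_i = x_nt_i$, $i = 1, 2, \ldots, n-1$. Then

$$S_n(1,1,\gamma) = n\int_0^1x_n^{(n-1)(1+n\gamma)}\,dx_n\int_0^1\cdots\int_0^1\bigl((1-t_1)\cdots(1-t_{n-1})\bigr)^{2\gamma}\,|\Delta(t_1,\ldots,t_{n-1})|^{2\gamma}\,dt_1\cdots dt_{n-1}$$

$$= \frac{1}{(n-1)\gamma+1}\,S_{n-1}(1,2\gamma+1,\gamma) = \frac{D_{n-1}(\gamma)}{(n-1)\gamma+1}\prod_{j=1}^{n-1}\frac{\Gamma(1+(j-1)\gamma)\,\Gamma(1+(j+1)\gamma)}{\Gamma(2+(n+j-1)\gamma)}, \tag{17.139}$$

where equation (17.139) follows by an application of (17.137). Equating (17.138) and (17.139) and simplifying, we arrive at

$$D_n(\gamma) = \frac{\Gamma(1+n\gamma)}{\Gamma(1+\gamma)}\,D_{n-1}(\gamma);$$

and, since $D_1(\gamma) = 1$, we finally have

$$D_n(\gamma) = \prod_{j=1}^n\frac{\Gamma(1+j\gamma)}{\Gamma(1+\gamma)}. \tag{17.140}$$

Combining (17.140) with (17.137) produces

$$S_n(\alpha,\beta,\gamma) = \prod_{j=1}^n\frac{\Gamma(\alpha+(j-1)\gamma)\,\Gamma(\beta+(j-1)\gamma)\,\Gamma(1+j\gamma)}{\Gamma(\alpha+\beta+(n+j-2)\gamma)\,\Gamma(1+\gamma)}, \tag{17.141}$$

proving (17.127) when $\alpha, \beta, \gamma$ are positive integers. To extend the result to $\operatorname{Re}\alpha > 0$, $\operatorname{Re}\beta > 0$, $\operatorname{Re}\gamma > 0$, Selberg applied Stirling’s approximation,

$$\Gamma(w) \sim \sqrt{2\pi}\,w^{w-\frac{1}{2}}e^{-w}\quad\text{for } |w| \to \infty \text{ and } \operatorname{Re}w > 0,$$

to the expression (call it $T_j$) inside the product in (17.141) to obtain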


. A a2 pei ee re igre 
Pn st Wg kee oad aD ~. (4G-¥ eye -' 07.142) 
EVASY> Gaga Gay 2 as 
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Note that as a gets large and 1 < j <n, the expression 
Gj- 12 
he oy heen eel 
(n+ j-2*2 
is bounded and, similarly, as 8 becomes large, the expression 
(n — jP2 
for = eens 
(a+ j-1P3 
is bounded. Selberg observed that for 1 < j <n, 


Os Dr a os 
Cr) ae 


This is not difficult to show. For 2 < j <n—1, 
PGP Sei 2) 7, 
and 
HPG-DOla— prt s @F-2BUIP@- pri s—j+2j-—HrIvry, 


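Thus, (17.142) is bounded for $\operatorname{Re}\gamma > 0$ when $\gamma \to \infty$ and the right-hand side of (17.141) is bounded for $\operatorname{Re}\alpha > 0$, $\operatorname{Re}\beta > 0$, and $\operatorname{Re}\gamma > 0$. Since the integral $S_n(\alpha,\beta,\gamma)$ is on an $n$-dimensional cube, it follows that $S_n(\alpha,\beta,\gamma)$ is bounded for $\operatorname{Re}\alpha > 0$, $\operatorname{Re}\beta > 0$, and $\operatorname{Re}\gamma > 0$; moreover, $S_n(\alpha,\beta,\gamma)$ is clearly analytic for these values of $\alpha, \beta, \gamma$. By Carlson’s theorem, then, (17.141) must hold for $\operatorname{Re}\alpha > 0$, $\operatorname{Re}\beta > 0$, and $\operatorname{Re}\gamma > 0$.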

In his lectures on special functions, delivered around the period 1990-1992 at the University of Wisconsin, Richard Askey derived Selberg’s integral by a different method, starting from (17.133); we sketch this method. From (17.133) it follows by symmetry in $\alpha$ and $\beta$ and then by iteration that

$$S_n(\alpha,\beta,\gamma) = \prod_{j=1}^n\frac{(\alpha+\beta+(2n-j-1)\gamma)_k}{(\beta+(n-j)\gamma)_k}\,S_n(\alpha,\beta+k,\gamma).$$

Let $k \to \infty$ and apply (17.2) to obtain

$$S_n(\alpha,\beta,\gamma) = \prod_{j=1}^n\frac{\Gamma(\beta+(n-j)\gamma)}{\Gamma(\alpha+\beta+(2n-j-1)\gamma)}\int_0^\infty\cdots\int_0^\infty\prod_{j=1}^nx_j^{\alpha-1}e^{-x_j}\prod_{1\le i<j\le n}|x_i-x_j|^{2\gamma}\,dx. \tag{17.143}$$

Denote the integral in (17.143) by $M_n(\alpha,\gamma)$. Symmetry in $\alpha$ and $\beta$ and (17.143) imply that

$$\frac{M_n(\alpha,\gamma)}{\prod_{j=1}^n\Gamma(\alpha+(n-j)\gamma)} = \frac{M_n(\beta,\gamma)}{\prod_{j=1}^n\Gamma(\beta+(n-j)\gamma)} = D_n(\gamma),$$

because only if it is independent of $\alpha$ and $\beta$ can a function of $\alpha$ and $\gamma$ be equal to a function of $\beta$ and $\gamma$. Therefore

$$S_n(\alpha,\beta,\gamma) = \prod_{j=1}^n\frac{\Gamma(\alpha+(n-j)\gamma)\,\Gamma(\beta+(n-j)\gamma)}{\Gamma(\alpha+\beta+(2n-j-1)\gamma)}\,D_n(\gamma). \tag{17.144}$$


This time, we compute $D_n(\gamma)$ by a modification of Selberg’s method. Observe that by the symmetry in the variables $x_1, x_2, \ldots, x_n$,

$$\int_{C_n}\Phi(x)\,dx = n!\int_0^1\int_{x_n}^1\cdots\int_{x_2}^1\Phi(x)\,dx_1\cdots dx_n. \tag{17.145}$$

Now for a differentiable function $f$, integration by parts gives

$$\lim_{\alpha\to0^+}\alpha\int_0^1t^{\alpha-1}f(t)\,dt = f(0),$$

a result also true for a continuous function. So multiply (17.145) by $\alpha$ and let $\alpha \to 0^+$ and use (17.144) to find

$$n\int_{C_{n-1}}\prod_{j=1}^{n-1}\left(x_j^{2\gamma-1}(1-x_j)^{\beta-1}\right)\prod_{1\le i<j\le n-1}|x_i-x_j|^{2\gamma}\,dx = D_n(\gamma)\prod_{j=1}^n\frac{\Gamma(\beta+(j-1)\gamma)}{\Gamma(\beta+(n+j-2)\gamma)}\prod_{j=2}^n\Gamma((j-1)\gamma). \tag{17.146}$$

Again by (17.144), the integral on the left-hand side of (17.146) also equals

$$nD_{n-1}(\gamma)\prod_{j=1}^{n-1}\frac{\Gamma(2\gamma+(j-1)\gamma)\,\Gamma(\beta+(j-1)\gamma)}{\Gamma(2\gamma+\beta+(n+j-3)\gamma)}. \tag{17.147}$$

Comparing (17.146) and (17.147) reveals the functional relation

$$D_n(\gamma) = \frac{n\,\Gamma(n\gamma)}{\Gamma(\gamma)}\,D_{n-1}(\gamma) = \frac{\Gamma(n\gamma+1)}{\Gamma(\gamma+1)}\,D_{n-1}(\gamma),$$

and therefore

$$D_n(\gamma) = \prod_{j=1}^n\frac{\Gamma(j\gamma+1)}{\Gamma(\gamma+1)},$$

completing Askey’s evaluation of the Selberg integral.66


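Also note that (17.143) implies that for $\operatorname{Re}\alpha > 0$ and $\operatorname{Re}\gamma > 0$,

$$\int_0^\infty\cdots\int_0^\infty\prod_{j=1}^nx_j^{\alpha-1}e^{-x_j}\prod_{1\le i<j\le n}|x_i-x_j|^{2\gamma}\,dx = \prod_{j=1}^n\frac{\Gamma(\alpha+(j-1)\gamma)\,\Gamma(1+j\gamma)}{\Gamma(1+\gamma)}.$$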
Now if we take a = 6 and x; = 3 (1+ 36), = 1,...,n, and leta ~ o, we 


get the result, for Re (y > 0), 


ne hs Is, gg Onis |] ee. 
[ i oo( sb) I] |, —tj|"' dt = ent TT rd +y) 


l<i<j<n 


17.13 Good’s Proof of Dyson’s Conjecture 


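One implication of Selberg’s integral formula is an integral that arises in physics, known as Dyson’s integral:

$$\frac{1}{(2\pi)^n}\int_{-\pi}^{\pi}\cdots\int_{-\pi}^{\pi}\prod_{1\le j<l\le n}\left|e^{i\theta_j}-e^{i\theta_l}\right|^{2\gamma}\,d\theta_1\cdots d\theta_n = \frac{\Gamma(n\gamma+1)}{(\Gamma(\gamma+1))^n}.$$

In fact, this is a particular case of Selberg’s integral. It can be used to find the constant term in the product

$$\prod_{j\ne l}\left(1-\frac{z_j}{z_l}\right)^k, \tag{17.148}$$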

66 See Andrews et al. (1999) pp. 405-406. 
where $k$ is a positive integer and $l \ne j$. To verify this claim, observe that for $z_j = e^{i\theta_j}$,

$$|z_j-z_l|^{2k} = (z_j-z_l)^k(\bar z_j-\bar z_l)^k = \left(1-\frac{z_l}{z_j}\right)^k\left(1-\frac{z_j}{z_l}\right)^k.$$

Also note the fact that any power other than 0 of $z_j$ vanishes on integration. So the constant term in (17.148) is given by

$$\frac{\Gamma(nk+1)}{(\Gamma(k+1))^n} = \frac{(nk)!}{(k!)^n}.$$

In 1962, Dyson conjectured that if $a_1, a_2, \ldots, a_n$ were nonnegative integers, then the constant term in the product

$$\prod_{j\ne l}\left(1-\frac{x_j}{x_l}\right)^{a_j}\quad\text{was}\quad\frac{(a_1+a_2+\cdots+a_n)!}{a_1!\,a_2!\cdots a_n!}.$$

Wilson67 and Gunson68 independently proved Dyson’s conjecture, and in 1970 Good69 provided a short proof, starting with the Waring-Lagrange interpolation formula and deriving equation (9.31):

$$\sum_{j=1}^n\ \prod_{\substack{1\le k\le n\\ k\ne j}}\left(1-\frac{x_j}{x_k}\right)^{-1} = 1.$$

He then set

$$F_n(a_1,a_2,\ldots,a_n) = \prod_{j\ne l}\left(1-\frac{x_j}{x_l}\right)^{a_j},$$

and multiplied (9.31) by $F_n(a_1,a_2,\ldots,a_n)$ to arrive at the recurrence relation

$$F_n(a_1,a_2,\ldots,a_n) = \sum_{j=1}^nF_n(a_1,\ldots,a_j-1,\ldots,a_n). \tag{17.149}$$

He next let C. T. $F_n(a_1,a_2,\ldots,a_n)$ denote the constant term in $F_n(a_1,a_2,\ldots,a_n)$ so that (17.149) implied that

$$\text{C. T. }F_n(a_1,a_2,\ldots,a_n) = \sum_{j=1}^n\text{C. T. }F_n(a_1,\ldots,a_j-1,\ldots,a_n).$$
j=l 


67 Wilson (1962). 
68 Gunson (1962). 
69 Good (1970). 
Good then observed that C. T. $F_n(0,0,\ldots,0) = 1$, and if $a_k = 0$, then

$$\text{C. T. }F_n(a_1,a_2,\ldots,a_n) = \text{C. T. }F_{n-1}(a_1,\ldots,a_{k-1},a_{k+1},\ldots,a_n),$$

because only the nonpositive powers of $x_k$ appeared in $F_n(a_1,a_2,\ldots,a_n)$. He then noted that, in fact,

$$\frac{(a_1+a_2+\cdots+a_n)!}{a_1!\,a_2!\cdots a_n!}$$

satisfied the same recurrence relation (17.149) and initial conditions, completing his proof.
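Dyson’s constant-term identity can be confirmed directly for small exponents by multiplying out the product. The sketch below represents a Laurent polynomial as a dictionary from exponent tuples to integer coefficients; this representation and the function names are illustrative choices, not part of Good’s argument.

```python
from collections import defaultdict
from math import factorial


def multiply(poly, j, l):
    """Multiply a Laurent polynomial by the factor (1 - x_j / x_l)."""
    out = defaultdict(int)
    for exps, coeff in poly.items():
        out[exps] += coeff
        shifted = list(exps)
        shifted[j] += 1          # numerator x_j
        shifted[l] -= 1          # denominator x_l
        out[tuple(shifted)] -= coeff
    return out


def dyson_constant_term(a):
    """Constant term of the product over j != l of (1 - x_j/x_l)^(a_j)."""
    n = len(a)
    poly = {(0,) * n: 1}
    for j in range(n):
        for l in range(n):
            if j != l:
                for _ in range(a[j]):
                    poly = multiply(poly, j, l)
    return poly.get((0,) * n, 0)


def multinomial(a):
    """(a_1 + ... + a_n)! / (a_1! ... a_n!)."""
    value = factorial(sum(a))
    for a_i in a:
        value //= factorial(a_i)
    return value


for a in [(1, 1), (2, 1, 1), (1, 2, 3)]:
    print(a, dyson_constant_term(a), multinomial(a))
```

Brute-force expansion is practical only for small exponents, which is one reason Good’s recurrence (17.149) is the more useful tool.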


17.14 Bohr, Mollerup, and Artin on the Gamma Function 


In 1922, an important Danish language textbook on analysis was published, written by colleagues at the Polyteknisk Læreanstalt, Harald Bohr (1887-1951) and Johannes Mollerup (1872-1937). Bohr gained early fame in 1908 as a member of his country’s silver-medal Olympic soccer team; he later worked on the Riemann zeta function and did his most original mathematical work in creating the theory of almost periodic functions. Bohr and Mollerup’s four-volume work, Lærebog i Matematisk Analyse, in various editions, had a profound effect on the teaching of analysis in Denmark, greatly raising the standards. In this work,70 they applied the idea of logarithmic convexity to prove that the gamma integral equaled the infinite product, that is,

$$\int_0^\infty e^{-x}x^{m-1}\,dx = \lim_{n\to\infty}\frac{n!\,n^{m-1}}{m(m+1)\cdots(m+n-1)}. \tag{17.150}$$

They started with the right-hand side of (17.150) as the definition of $\Gamma(x)$. They relied on a definition of the Danish mathematician Johan Jensen of a convex function as a function $\varphi(x)$ with the property that for every pair of numbers $x_1 < x_3$ and $x_2 = \frac{x_1+x_3}{2}$,

$$\varphi(x_2) \le \frac{\varphi(x_1)+\varphi(x_3)}{2}. \tag{17.151}$$

When $\varphi$ was continuous, Jensen noted that (17.151) was equivalent to

$$\varphi(tx_1+(1-t)x_3) \le t\varphi(x_1)+(1-t)\varphi(x_3),\quad 0 \le t \le 1, \tag{17.152}$$

for all pairs of numbers $x_1 < x_3$. We note that Jensen’s definition of convexity arose from a study of inequalities. Jensen’s work is discussed in our Section 6.7.

In proving (17.150), we follow closely the Bohr—Mollerup notation and argument; 
by contrast, textbooks usually follow the treatment of Artin. The result we now refer 
to as the Bohr—Mollerup theorem was not explicitly stated by Bohr and Mollerup but 


70 Bohr and Mollerup (1922) vol. 3. 
follows from their argument. Bohr and Mollerup denoted the integral in (17.150) by $A(m)$ and then observed that $A(x+1) = xA(x)$ and $A(1) = 1$. Moreover, by the Cauchy-Schwarz inequality

$$\left(A\left(\frac{x_1+x_3}{2}\right)\right)^2 \le A(x_1)A(x_3),\quad 0 < x_1 < x_3. \tag{17.153}$$

Observe that this result is equivalent to the logarithmic convexity of $A(x)$ because $A(x)$ is continuous and (17.153) implies

$$\ln A\left(\frac{x_1+x_3}{2}\right) \le \frac{1}{2}\left(\ln A(x_1)+\ln A(x_3)\right).$$

For a definition of logarithmic convexity, see Section 3.2. Following Bohr and Mollerup, write $x_2 = \frac{x_1+x_3}{2}$. They then set

$$P(x) = \frac{A(x)}{\Gamma(x)}, \tag{17.154}$$

where $\Gamma(x)$ was defined by (17.4). It followed that $P(1) = 1$ and

$$P(x+1) = P(x)\quad\text{for } x > 0. \tag{17.155}$$

Bohr and Mollerup then used (17.153) to show that $P(x) = 1$. For this purpose, they noted that when $n$ was an integer,

$$\lim_{n\to\infty}\frac{\Gamma(x_1+n)\Gamma(x_3+n)}{(\Gamma(x_2+n))^2} = 1, \tag{17.156}$$

because

$$\lim_{n\to\infty}\frac{\Gamma(n+x)}{n^{x-1}\,n!} = \lim_{n\to\infty}\frac{(n-1+x)(n-2+x)\cdots x\,\Gamma(x)}{n^{x-1}\,n!} = 1.$$

Now by the periodicity of $P(x)$ given in (17.155), they had

$$\frac{P(x_1)P(x_3)}{(P(x_2))^2} = \frac{P(x_1+n)P(x_3+n)}{(P(x_2+n))^2}.$$

By letting $n \to \infty$ in this equation and using (17.153) and (17.156), they could see that

$$1 \le \frac{P(x_1)P(x_3)}{(P(x_2))^2}. \tag{17.157}$$

Next, they supposed $P(x)$ was not a constant. Then, because $P(x+1) = P(x)$, it was possible to choose $x_1 < x_2$ such that $P(x_1) < P(x_2)$. They took the sequence $x_1 < x_2 < x_3 < x_4 < \cdots$ such that the difference between two consecutive numbers was always the same, that is, equal to $x_2-x_1$. This meant that if $x_{n-1} < x_n < x_{n+1}$ was a part of the above sequence, then $x_n = \frac{x_{n-1}+x_{n+1}}{2}$. By (17.157), they obtained

$$\frac{P(x_2)}{P(x_1)} \le \frac{P(x_3)}{P(x_2)} \le \frac{P(x_4)}{P(x_3)} \le \cdots, \tag{17.158}$$

implying by induction that

$$\frac{P(x_n)}{P(x_1)} \ge \left(\frac{P(x_2)}{P(x_1)}\right)^{n-1}.$$

They could conclude that $P(x_n) \to \infty$ as $n \to \infty$. They noted that $P(x)$ was continuous on $[1,2]$ and hence bounded on that interval. So they got a contradiction by the periodicity of $P$. Thus, $P(x)$ was a constant, necessarily equal to 1, and their proof was complete.

Emil Artin (1898-1962) saw that the Bohr–Mollerup proof of (17.150) could be 
simplified if (17.152) instead of (17.151) were used for convexity.⁷¹ Artin’s argument 
applied Hölder’s inequality to show that 

$$\ln A(t x_1 + (1-t)x_3) \le t \ln A(x_1) + (1-t)\ln A(x_3), \quad \text{for } 0 < t < 1.$$

He then proved more generally that if f(1) = 1, f(x + 1) = x f(x), and ln f(x) 
satisfied (17.152), then 

$$f(x) = \lim_{n\to\infty} \frac{n!\, n^{x-1}}{x(x+1)\cdots(x+n-1)}.$$


Artin’s proof was quite short. Note that by the first two conditions, ln f(n) = 
ln(n − 1)!, when n was a positive integer. Next, let 0 < x < 1. With $x_1 = n$, 
$x_3 = x + n + 1$ and $t = \frac{x}{x+1}$ in (17.152), Artin had 

$$(x+1)\ln n! \le x\ln (n-1)! + \ln f(n+1+x).$$

This simplified to 

$$\frac{n!\, n^{x-1}}{x(x+1)\cdots(x+n-1)}\cdot\frac{n}{n+x} \le f(x).$$

Similarly, with $x_1 = n + 1$, $x_3 = n + 2$ and $t = 1 - x$, he had, after simplification, 

$$f(x) \le \frac{n!\,(n+1)^x}{x(x+1)\cdots(x+n)} = \frac{n!\, n^{x-1}}{x(x+1)\cdots(x+n-1)}\left(\frac{n+1}{n}\right)^{x}\frac{n}{n+x}.$$


The two inequalities yielded the required formula when n → ∞. Artin was 
a number theorist and algebraist. In algebra, he was a disciple of Emmy Noether 
(1882-1935) and advocated a very abstract point of view. It is therefore interesting 
to see him make this contribution to special functions. Some of his other results in this 
area are mentioned in the exercises. In addition, the reader may refer to Sections 3.2 
and 3.4, for the use of logarithmic convexity by Wallis and Stieltjes. 
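
The way Artin's two inequalities squeeze f(x) can be checked numerically. The following is only a sketch, not part of Artin's argument: it assumes f = Γ (taken from Python's `math.gamma`), reads the lower bound as n! n^(x−1)/(x(x+1)⋯(x+n−1)) · n/(n+x) and the upper bound as that quantity times ((n+1)/n)^x, and the function name `artin_bounds` is made up for illustration.

```python
import math


def artin_bounds(x: float, n: int) -> tuple[float, float]:
    """Lower and upper bounds for Gamma(x), 0 < x < 1, from the two inequalities."""
    # log of n! n^(x-1) / (x (x+1) ... (x+n-1)); logarithms avoid overflow for large n
    log_core = math.lgamma(n + 1) + (x - 1) * math.log(n)
    log_core -= sum(math.log(x + k) for k in range(n))
    core = math.exp(log_core)
    lower = core * n / (n + x)
    upper = core * ((n + 1) / n) ** x * n / (n + x)
    return lower, upper


if __name__ == "__main__":
    for n in (10, 100, 1000):
        low, high = artin_bounds(0.5, n)
        print(n, low, math.gamma(0.5), high)
```

The ratio of the two bounds is ((n+1)/n)^x, which tends to 1, so both bounds converge to the same limit as n grows.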


71 Artin (1964) pp. 14-15. 

17.15 Kummer’s Fourier Series for ln Γ(x) 


By an interesting application of definite integrals, in 1847 Kummer derived the Fourier 
series for ln Γ(x).⁷² This formula is important in number theory, although Kummer’s 
purpose was to obtain a new derivation for Gauss’s multiplication formula for the 
gamma function. The latter can be written as 

$$\sum_{k=0}^{n-1}\ln\Gamma\left(x + \frac{k}{n}\right) = \frac{1}{2}(n-1)\ln 2\pi + \frac{1}{2}(1-2nx)\ln n + \ln\Gamma(nx). \tag{17.159}$$

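
Gauss's multiplication formula in the logarithmic form (17.159) is easy to test numerically. The sketch below is an illustration only; it relies on `math.lgamma` from the Python standard library, and the function name is hypothetical.

```python
import math


def multiplication_formula_sides(x: float, n: int) -> tuple[float, float]:
    """Return (left side, right side) of the log form of Gauss's multiplication formula."""
    lhs = sum(math.lgamma(x + k / n) for k in range(n))
    rhs = (0.5 * (n - 1) * math.log(2 * math.pi)
           + 0.5 * (1 - 2 * n * x) * math.log(n)
           + math.lgamma(n * x))
    return lhs, rhs


if __name__ == "__main__":
    for x, n in ((0.37, 5), (1.25, 3), (0.5, 2)):
        print(x, n, multiplication_formula_sides(x, n))
```

For n = 2 the identity reduces to the duplication formula.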
Kummer explained why he thought of the Fourier series in this connection. Suppose 


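$$f(x) = A_0 + 2\sum_{k=1}^{\infty} A_k\cos 2k\pi x + 2\sum_{k=1}^{\infty} B_k\sin 2k\pi x, \quad \text{for } 0 < x < 1, \tag{17.160}$$

where 

$$A_k = \int_0^1 f(x)\cos 2k\pi x\,dx, \qquad B_k = \int_0^1 f(x)\sin 2k\pi x\,dx. \tag{17.161}$$

Then 

$$\sum_{k=0}^{n-1} f\left(x + \frac{k}{n}\right) = n\left(A_0 + 2\sum_{k=1}^{\infty} A_{nk}\cos 2kn\pi x + 2\sum_{k=1}^{\infty} B_{nk}\sin 2kn\pi x\right). \tag{17.162}$$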


Moreover, by denoting 


$$F(x) = A_0 + 2\sum_{k=1}^{\infty} A_{nk}\cos 2k\pi x + 2\sum_{k=1}^{\infty} B_{nk}\sin 2k\pi x,$$


the right-hand side of (17.162) was nF (nx). Thus, equation (17.162) was suggestive 
of Gauss’s formula (17.159). So Kummer took f(x) = ln Γ(x) in (17.160). Then by 
Euler’s reflection formula 

$$\ln\Gamma(x) + \ln\Gamma(1-x) = \ln 2\pi - \ln(2\sin\pi x) = \ln 2\pi + \sum_{k=1}^{\infty}\frac{\cos 2k\pi x}{k}, \tag{17.163}$$

where the last relation followed from Euler’s Fourier series for ln(2 sin πx). By 
(17.160), 


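$$\ln\Gamma(x) + \ln\Gamma(1-x) = 2A_0 + 4\sum_{k=1}^{\infty} A_k\cos 2k\pi x$$

and hence, by (17.163), 

$$A_0 = \frac{1}{2}\ln 2\pi, \qquad A_k = \frac{1}{4k}, \quad k = 1, 2, 3, \ldots. \tag{17.164}$$

Kummer had to work harder to find $B_k$. He started with Plana’s formula 

$$\ln\Gamma(x) = \int_0^1\left(\frac{1 - t^{x-1}}{1-t} - x + 1\right)\frac{dt}{\ln t}, \quad x > 0,$$

an integrated form of Gauss’s formula for Ψ(x). So he had 

$$B_k = \int_0^1\!\!\int_0^1\left(\frac{1 - t^{x-1}}{1-t} - x + 1\right)\frac{\sin 2k\pi x}{\ln t}\,dt\,dx. \tag{17.165}$$

Since 

$$\int_0^1\sin 2k\pi x\,dx = 0, \qquad \int_0^1 x\sin 2k\pi x\,dx = -\frac{1}{2k\pi}$$

and 

$$\int_0^1 t^{x-1}\sin 2k\pi x\,dx = \frac{(1-t)\,2k\pi}{t\left((\ln t)^2 + 4k^2\pi^2\right)},$$

Kummer reduced (17.165) to 

$$B_k = \int_0^1\left(\frac{-2k\pi}{t\left((\ln t)^2 + 4k^2\pi^2\right)} + \frac{1}{2k\pi}\right)\frac{dt}{\ln t}.$$

Then, with $t = e^{-2k\pi u}$, 

$$B_k = \frac{1}{2k\pi}\int_0^\infty\left(\frac{1}{1+u^2} - e^{-2k\pi u}\right)\frac{du}{u}.$$

When k = 1, 

$$B_1 = \frac{1}{2\pi}\int_0^\infty\left(\frac{1}{1+u^2} - e^{-2\pi u}\right)\frac{du}{u}.$$

Kummer then employed a result of Dirichlet: 

$$-\frac{\gamma}{2\pi} = \frac{1}{2\pi}\int_0^\infty\left(e^{-u} - \frac{1}{1+u}\right)\frac{du}{u},$$

where γ was Euler’s constant. See Exercise 3b in this connection. Therefore, 

$$B_1 = \frac{\gamma}{2\pi} + \frac{1}{2\pi}\int_0^\infty\frac{e^{-u} - e^{-2\pi u}}{u}\,du + \frac{1}{2\pi}\int_0^\infty\left(\frac{1}{1+u^2} - \frac{1}{1+u}\right)\frac{du}{u}.$$

The first integral equals ln 2π, and the change of variables u to 1/u showed that the value of 
the second integral was 0. Thus, 

$$B_1 = \frac{\gamma}{2\pi} + \frac{1}{2\pi}\ln 2\pi.$$

To find $B_k$, he observed that 

$$kB_k - B_1 = \frac{1}{2\pi}\int_0^\infty\frac{e^{-2\pi u} - e^{-2k\pi u}}{u}\,du = \frac{1}{2\pi}\ln k.$$

Thus, 

$$B_k = \frac{1}{2k\pi}\left(\gamma + \ln 2k\pi\right), \quad k = 1, 2, 3, \ldots,$$

and Kummer got his Fourier expansion: 

$$\ln\Gamma(x) = \frac{1}{2}\ln 2\pi + \sum_{k=1}^{\infty}\frac{\cos 2k\pi x}{2k} + \sum_{k=1}^{\infty}\frac{\gamma + \ln 2k\pi}{k\pi}\sin 2k\pi x.$$

72 Kummer (1847). 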
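
Kummer's expansion can be tested by summing it numerically. The sketch below assumes the expansion is read as ln Γ(x) = ½ ln 2π + Σ cos(2kπx)/(2k) + Σ (γ + ln 2kπ)/(kπ) · sin(2kπx) for 0 < x < 1; the function name and the hard-coded value of Euler's constant are illustrative choices. The series converges slowly (the sine coefficients decay only like ln k / k), so many terms are needed for a few correct digits.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this sketch


def kummer_partial_sum(x: float, terms: int) -> float:
    """Partial sum of Kummer's Fourier series for ln Gamma(x), 0 < x < 1."""
    total = 0.5 * math.log(2 * math.pi)
    for k in range(1, terms + 1):
        angle = 2 * k * math.pi * x
        total += math.cos(angle) / (2 * k)
        total += (EULER_GAMMA + math.log(2 * k * math.pi)) / (k * math.pi) * math.sin(angle)
    return total


if __name__ == "__main__":
    for x in (0.25, 0.3, 0.5, 0.8):
        print(x, kummer_partial_sum(x, 20000), math.lgamma(x))
```

At x = 1/2 the sine terms vanish and the cosine series sums to −(ln 2)/2, which recovers ln Γ(1/2) = ½ ln π.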


17.16 Exercises 
(1) Show that by taking $k = p + q\sqrt{-1}$, p > 0 in 

$$\int_0^\infty x^{n-1}e^{-kx}\,dx = \frac{\Gamma(n)}{k^n}$$

we get 

$$\int_0^\infty x^{n-1}e^{-px}\cos qx\,dx = \frac{\Gamma(n)\cos n\theta}{f^n}$$

and 

$$\int_0^\infty x^{n-1}e^{-px}\sin qx\,dx = \frac{\Gamma(n)\sin n\theta}{f^n},$$

where $f = (p^2+q^2)^{\frac{1}{2}}$ and $\tan\theta = \frac{q}{p}$. 

Deduce that 

$$\int_0^\infty\frac{\cos x}{\sqrt{x}}\,dx = \sqrt{\frac{\pi}{2}}, \qquad \int_0^\infty\frac{\sin x}{\sqrt{x}}\,dx = \sqrt{\frac{\pi}{2}},$$

$$\int_0^\infty\frac{e^{-px}\sin qx}{x}\,dx = \theta, \quad \text{and} \quad \int_0^\infty\frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$

All these definite integrals appeared in Euler’s paper of 1781, though 
he had evaluated some of them earlier by other methods. See Eu. I-19 
pp. 217-227. Euler’s deductions were formal and he was the first to make 
use of complex parameters in this way. Although he initially assumed the 
parameter p > 0, he let p → 0 to obtain the later integrals. He expressed 
this by setting p = 0. He had some reservations about presenting the final 
integral for $\frac{\sin x}{x}$, but numerical computation convinced him of its correctness. 
This paper was influential in the development of complex analysis by Cauchy. 
Legendre and Laplace referred to it when extending its methods to evaluate 
other integrals and these results motivated Cauchy to begin his work on 
complex integrals in 1814. 

(2) If Euler found a formula interesting, he often evaluated it in more than one 
way. Complete the details and verify the steps of the three methods he gave to 
show that the integral $\int_0^{\pi/2}\ln\sin\phi\,d\phi = -\frac{\pi}{2}\ln 2$. 


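(a) Euler began by setting x = sin φ and then cos φ = y to get 

$$\int_0^1\frac{\ln x}{\sqrt{1-x^2}}\,dx = \int_0^1\frac{\ln\sqrt{1-y^2}}{\sqrt{1-y^2}}\,dy = -\int_0^1\frac{\frac{y^2}{2} + \frac{y^4}{4} + \frac{y^6}{6} + \frac{y^8}{8} + \text{etc.}}{\sqrt{1-y^2}}\,dy$$

$$= -\frac{\pi}{2}\left(\frac{1}{2^2} + \frac{1\cdot 3}{2\cdot 4^2} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6^2} + \frac{1\cdot 3\cdot 5\cdot 7}{2\cdot 4\cdot 6\cdot 8^2} + \text{etc.}\right).$$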


Euler showed that the sum of the series was In 2. For this purpose, he 
noted that by the binomial expansion 


4 = + etc. 
Poe ee: ame eae eae ee Be 


He applied the substitution v = V1 — z? to evaluate 


Hence 

(b) In his second method, Euler started with a divergent series. He applied the 
addition formula 2 sin nθ sin θ = cos(n − 1)θ − cos(n + 1)θ to get 

$$\frac{\cos\theta}{\sin\theta} = 2\sin 2\theta + 2\sin 4\theta + 2\sin 6\theta + 2\sin 8\theta + \text{etc.}$$

He integrated to obtain 

$$\ln\sin\theta = C - \cos 2\theta - \frac{1}{2}\cos 4\theta - \frac{1}{3}\cos 6\theta - \frac{1}{4}\cos 8\theta - \text{etc.} \tag{17.166}$$

Then $\theta = \frac{\pi}{2}$ gave C = −ln 2. Euler integrated again to obtain the 
required result. 

(c) Euler proved the more general formula 

$$\int_0^1 x^{p-1}X\ln x\,dx = \int_0^1 x^{p-1}X\,dx\int_0^1\frac{x^{p-1}(x^m - 1)}{1 - x^n}\,dx, \tag{17.167}$$

where $X = (1 - x^n)^{\frac{m-n}{n}}$. The result in (a) and (b) would be obtained 
by setting n = 2, m = p = 1. To prove (17.167), Euler set $P = 
\int_0^1 x^{p-1}X\,dx$. By an argument which gives (17.37), show that 


n 2n 3n ptm pt+tm-+n Deut a 
— - s 4 a . - etc. 
m m+n m+2n 77) p+n p+2n 


He then let p be the variable and m, n be constants to obtain 

$$\frac{1}{P}\frac{dP}{dp} = \frac{1}{m+p} - \frac{1}{p} + \frac{1}{m+p+n} - \frac{1}{p+n} + \frac{1}{m+p+2n} - \frac{1}{p+2n} + \text{etc.}$$

Prove the result by showing that this partial fractions expansion 
equals 

$$\int_0^1\frac{x^{p-1}(x^m - 1)}{1 - x^n}\,dx.$$

Euler actually worked with a product for $\frac{P}{Q}$ where Q was an integral 
similar to P. Lacroix (1819) gave the preceding simplification on p. 437. 
These results are contained in Eu. I-18, pp. 23-50. This volume is 
full of ingenious evaluations of definite integrals. Observe that (17.166) 
is the Fourier series expansion of ln sin θ used by Kummer to obtain 
(17.163). Also, Euler could have derived (17.166) without using divergent 
series by expanding $\ln(1 - e^{2i\theta})$. But Euler treated divergent series as 
very much a part of mathematics, a view validated only in the twentieth 
century. 

(3) The following derivation of Gauss’s multiplication formula is due to Dirichlet 
(1969) pp. 274-276. Verify the successive steps for Re a > 0: 

(a) $\int_0^\infty\left(e^{-y} - e^{-sy}\right)\frac{dy}{y} = \ln s.$ 
(b) Next 

$$\Gamma'(a) = \int_0^\infty e^{-s}s^{a-1}\ln s\,ds = \int_0^\infty\frac{dy}{y}\left(e^{-y}\int_0^\infty e^{-s}s^{a-1}\,ds - \int_0^\infty e^{-(1+y)s}s^{a-1}\,ds\right)$$

$$= \Gamma(a)\int_0^\infty\frac{dy}{y}\left(e^{-y} - \frac{1}{(1+y)^a}\right).$$


(©) £nl@ = fy (eo! =x") 2. 
(d) Let 


(17.168) 


Then 


n—-1 
d k 
S= —Inf —}. 
La) 
k=0 
(e) Change a to na in (c) and subtract the result from (d) to see that 
$S - \frac{d}{da}\ln\Gamma(na)$ is independent of a. Denote this quantity by ln p and 
integrate to get 

$$\prod_{k=0}^{n-1}\Gamma\left(a + \frac{k}{n}\right) = q\,p^{a}\,\Gamma(na).$$

(f) Change $a \to a + \frac{1}{n}$ in (e) to deduce that $p = n^{-n}$. 

(g) Euler’s formula (17.17) implies that $q = (2\pi)^{\frac{n-1}{2}}\sqrt{n}$. 
(h) Show that Euler’s formula (17.17) is obtained by applying 

$$\Gamma(x)\Gamma(1-x) = \frac{\pi}{\sin\pi x}.$$


(4) Show that for suitably chosen constants a, b, c, and k, 
(a) [ore ryt lay = GO,. 
zzb-1 
(b) tee ev ye! (oe —k+y)zz b- \dz) dy =T(a) {5° se dz. 
~kzza-l 


(c) T(b) dey ay =F) ii er dz. 


ye — T@r) 
@ fo’ appa 4 = Taste 
See Dirichlet (1969) vol. I, p. 278. 


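(5) Use Dirichlet’s integral formula for $\frac{d}{da}\ln\Gamma(a)$ (17.168) to show that for a > 0 

$$\gamma + \frac{d}{da}\ln\Gamma(a+1) = \int_0^1\frac{1 - y^a}{1 - y}\,dy.$$

Deduce the infinite product for Γ(a): 

$$\Gamma(a+1) = a\Gamma(a) = e^{-\gamma a}\,\frac{e^{a}}{1+a}\cdot\frac{e^{\frac{a}{2}}}{1+\frac{a}{2}}\cdot\frac{e^{\frac{a}{3}}}{1+\frac{a}{3}}\cdots.$$

For details, see Schlömilch (1843). 
(6) Show that Dirichlet’s integral formula (17.168) can be obtained from Gauss’s 
(17.173). 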


(7) For 0 < b < 1, set $B(b) = \int_0^\infty\frac{t^{b-1}}{1+t}\,dt$. Show that 


and 


Deduce that 


pes op Soi SB 
pos) -|/ fd) (17.169) 
0 


(st + 1l)(t+s) 


Observe that 


lore) 1 oo 4b-1 
B= | (| ar) ds 
o stl \Jo tts 
—= 
= dt 
0 t-—1l 


Deduce that 

From (17.169) obtain 

$$B(b)\int_{1-b}^{b}[B(t)]^2\,dt = 2\int_0^\infty\frac{t^{b-1}\ln t}{t+1}\,dt = 2B'(b).$$

Observe that B(b) = B(1 − b) and deduce that $B'\left(\frac{1}{2}\right) = 0$, 

$$\int_{1-b}^{b}[B(t)]^2\,dt = 2\int_{\frac{1}{2}}^{b}[B(t)]^2\,dt$$

and 

$$B(b)\int_{\frac{1}{2}}^{b}[B(t)]^2\,dt = B'(b).$$

Now show that B satisfies the differential equation $BB'' - (B')^2 = B^4$. Solve 
the differential equation with initial conditions $B\left(\frac{1}{2}\right) = \pi$ and $B'\left(\frac{1}{2}\right) = 0$ to 
obtain B = π csc πb. This is Dedekind’s evaluation of the Eulerian integral 
B, a part of his doctoral dissertation. See Dedekind (1930) vol. 1, pp. 19-22 
and 29-31. 

(8) Let $c_1, c_2, \ldots, c_n$ be positive constants and set $f(x) = (x + c_1)(x + 
c_2)\cdots(x + c_n)$. Show that for 0 < b < n 

$$\int_0^\infty\frac{x^{b-1}}{f(x)}\,dx = \frac{\pi}{\sin\pi b}\sum_{k=1}^{n}\frac{c_k^{\,b-1}}{f'(-c_k)},$$

where f′ denotes the derivative of f. See Dedekind (1930) vol. 1, p. 24. 

(9) (a) Suppose that φ(x) is positive and twice continuously differentiable on 
0 < x < ∞ and satisfies (i) φ(x + 1) = φ(x) and (ii) $\phi\left(\frac{x}{2}\right)\phi\left(\frac{x+1}{2}\right) = 
d\,\phi(x)$, where d is a constant. Prove that φ is a constant. 

(b) Show that Γ(x)Γ(1 − x) sin πx satisfies the conditions of the first part 
of the problem. Deduce Euler’s formula (17.11). This proof of Euler’s 
reflection formula (17.11) is due to Artin (1964) chapter 4. 

(10) Suppose that f is a positive and twice continuously differentiable function on 
0 < x < ∞ and satisfies f(x + 1) = x f(x) and $2^{2x-1}f(x)f\left(x + \frac{1}{2}\right) = 
\sqrt{\pi}\,f(2x)$. Show that f(x) = Γ(x). See Artin (1964). 

(11) Prove the following results of Gauss on the digamma function: 

(a) For a positive integer n, 

$$\Psi(z+n) = \Psi(z) + \frac{1}{z+1} + \frac{1}{z+2} + \cdots + \frac{1}{z+n}.$$

(b) Ψ(0) = Γ′(1) = −γ = −0.57721566490153286060653. Euler computed 
the constant γ correctly to fifteen decimal places by an application 
of the Euler–Maclaurin summation formula. About twenty years later, 
in 1790, the Italian mathematician Lorenzo Mascheroni (1750-1800) 
computed γ to thirty-two decimal places by the same method. To compute 
γ, Gauss gave two asymptotic series for Ψ(z), obtained by taking the 
derivatives of the de Moivre and Stirling asymptotic series for ln Γ(z + 1). 
His value differed from Mascheroni’s in the twentieth place and so he 
persuaded F. B. G. Nicolai, a calculating prodigy, to repeat the 
computation and to extend it further. Nicolai calculated to forty places, 
given by Gauss in a footnote, and verified that Gauss’s computation 
was correct. 
(c) 

$$\Psi(-z) - \Psi(z-1) = \pi\cot\pi z. \tag{17.170}$$
(d) 
W(x) — Wy) = [geo A I ol Me : 
4: ame en een ey ae a) pay as 
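(e) 

$$\Psi(z) + \Psi\left(z - \frac{1}{n}\right) + \Psi\left(z - \frac{2}{n}\right) + \cdots + \Psi\left(z - \frac{n-1}{n}\right) = n\Psi(nz) - n\ln n.$$

(f) 

$$\Psi\left(-\frac{1}{n}\right) + \Psi\left(-\frac{2}{n}\right) + \cdots + \Psi\left(-\frac{n-1}{n}\right) = -(n-1)\gamma - n\ln n.$$

(g) For n an odd integer and m a positive integer less than n, 

$$\Psi\left(-\frac{m}{n}\right) = -\gamma + \frac{\pi}{2}\cot\frac{m\pi}{n} - \ln n + \sum_{k=1}^{\frac{n-1}{2}}\cos\frac{2km\pi}{n}\,\ln\left(2 - 2\cos\frac{2k\pi}{n}\right). \tag{17.171}$$

(h) For n even, 

$$\Psi\left(-\frac{m}{n}\right) = -\gamma + (-1)^m\ln 2 + \frac{\pi}{2}\cot\frac{m\pi}{n} - \ln n + \sum_{k=1}^{\frac{n-2}{2}}\cos\frac{2km\pi}{n}\,\ln\left(2 - 2\cos\frac{2k\pi}{n}\right). \tag{17.172}$$
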
(i) In unpublished work, Gauss explained how Ψ(z) could be used to express 
the second independent solution of the hypergeometric equation in certain 
special circumstances. See Gauss (1868-1927) vol. 3, pp. 154-160. 

(12) Prove that 

$$\frac{1}{2\pi i}\int_{-i\infty}^{i\infty}\Gamma(a+s)\Gamma(b+s)\Gamma(c-s)\Gamma(d-s)\,ds = \frac{\Gamma(a+c)\Gamma(a+d)\Gamma(b+c)\Gamma(b+d)}{\Gamma(a+b+c+d)},$$

where the path of integration is curved so that the poles of Γ(c − s)Γ(d − s) lie 
on the right of the path and the poles of Γ(a + s)Γ(b + s) lie on the left. This 
formula is due to Barnes (1908); it played an important role in his theory of 
the hypergeometric function. It is an extension of Euler’s beta integral formula 
(17.10), as pointed out by Askey. This can be seen by replacing b and d by 
b − it and d + it, respectively, and then setting s = tx and letting t → ∞. 

(13) Suppose F is a holomorphic function in the right half complex plane 
Re z > 0. Suppose also that F(1) = 1, F(z + 1) = zF(z) and that F(z) 
is bounded in the vertical strip 1 ≤ Re z < 2. Then F(z) = Γ(z) for 
Re z > 0. This uniqueness theorem, useful for giving short proofs of several 
basic results on the gamma function, was proved by Helmut Wielandt in 1939 
and published by Konrad Knopp in 1941. See Remmert (1996), who quotes a 
letter of Wielandt explaining this and gives references. 

(14) Prove Dirichlet’s formula: Suppose $c_1, c_2, c_3, \ldots$ is a sequence of complex 
numbers which satisfy $c_{n+k} = c_n$. Suppose $\sum_{n=1}^{\infty}\frac{c_n}{n^s}$ converges absolutely. 
Then 

$$\sum_{n=1}^{\infty}\frac{c_n}{n^s} = \frac{1}{\Gamma(s)}\int_0^1\frac{\sum_{n=1}^{k} c_n x^{n}}{1 - x^k}\left(\ln\frac{1}{x}\right)^{s-1}\frac{dx}{x}.$$

This is the formula Dirichlet applied to the problem of primes in arithmetic 
progressions. See Dirichlet (1969) vol. I, pp. 421-422. 

(15) Observe that for $a < \frac{1}{2}$ 

$$\int_0^1 t^{-\frac{1}{2}-a}(1-t)^{-\frac{1}{2}-a}\,dt = \frac{\left(\Gamma\left(\frac{1}{2} - a\right)\right)^2}{\Gamma(1-2a)}.$$

Write the integrand as $2^{2a+1}\left(1 - (1-2t)^2\right)^{-a-\frac{1}{2}}$, apply a change of variables, 
and then use Euler’s reflection formula to obtain 

$$\sqrt{\pi}\,\Gamma(a) = 2^{1-2a}\cos(a\pi)\,\Gamma(2a)\,\Gamma\left(\frac{1}{2} - a\right).$$

This proof of the duplication formula is Legendre’s (1811-1817) vol. 1, 
p. 284. 


17.17 Notes on the Literature 499 


(16) Prove that 


Joo sinh?! ucosh~4 ue du 
(B—1) f° sinh’~* u cosh!~4 ue-*" du 
_ 1 of @+)6+) @4 2)(B +2) 
X+ X-+ X+ X+ 


See Stieltjes (1993) vol. II, p. 391. 
(17) From Stieltjes’s formula in the previous exercise, deduce that 


r(x-4a+4)r(x+4a+3) 
P(x+4a44)P(x-4a+ 3) 


=1¢4 
4x —a+ 4x4 4x4 4x4 
1 1 1 1 
P(x = 304 a) P(x za 1) 
P(x = 4a4 3) r(x 5a 3) 


_ 4 1? 4a" 32 — 44? 5? —da® 77 —4a2 
~ Axt  &x+ 8x+ 8x+ 8x+ 


Also show that if 
Legeae nD. 1 
2-4-6---2n J(a(n+e))’ 
2 1-3 3-5 5-7 12" 
8n — 14+ 8n+ 8n+ 8n+ 8n-4 


then g(n) = 14 


See Stieltjes (1993) vol. II, pp. 396-398. 

Tweddle (2003) contains the English translation of Stirling’s Methodus Differentialis. 
Tweddle has helpfully added 120 pages of notes to clarify and explain Stirling’s propositions 
in modern terms. For more history of the gamma function, see Davis (1959) 
and Dutka (1991). 


18 


The Asymptotic Series for ln Γ(x) 


18.1 Preliminary Remarks 


The thought-provoking question “How large is n! for large n?” 
was first given a good approximate answer in about 1730 by the joint efforts of 
Abraham de Moivre (1667-1754) and James Stirling (1692-1770). The story of their 
cooperation is fascinating. 

Born in France, de Moivre was a victim of religious discrimination there, leading 
him as a young man to relocate to Britain, where he worked the rest of his life. 
De Moivre’s motivation in developing an easily useable approximation for the 
magnitude of n! arose from his interest in probability theory, a subject he began 
cultivating in 1707 at the age of 40. He became familiar with the works of Jakob 
Bernoulli, Niklaus I Bernoulli, and Pierre Montmort and went on to make very 
important contributions in that field. De Moivre supported himself as a consultant to 
gamblers, speculators, and rich patrons, helping them solve problems related to games 
of chance or the calculation of annuities. He published the first volume of his Doctrine 
of Chances in 1718. This work may have led Sir Alexander Cuming (1690-1775) to 
consult de Moivre on a problem that arose in the context of gambling. Cuming lived 
an eventful and long life, becoming a Fellow of the Royal Society of London as well 
as a Cherokee chief, and yet he died in poverty. 

To set up Cuming’s problem, first let p be the probability of any toss of a coin 
resulting in a head; the probability of x heads in n tosses would then be 

$$b(x,n,p) = \binom{n}{x}p^x(1-p)^{n-x}.$$

Cuming’s problem for de Moivre reduced to the calculation of the mean deviation 

$$\sum_{x=0}^{n}|x - np|\,b(x,n,p).$$


! Stephens and Lee (1908) vol. 5, pp. 294-295. 
2 Hald (1990) p. 470. 

For example, given $p = \frac{1}{2}$, he had the value of the sum as 

$$\frac{n}{2^{n+1}}\binom{n}{\frac{1}{2}n}.$$

With n an even number, 2m, the expression reduced to 

$$\frac{m}{2^{2m}}\binom{2m}{m}. \tag{18.1}$$
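
The reduction of the mean deviation to a single central binomial coefficient can be confirmed in exact arithmetic. The sketch below assumes that (18.1) reads m·C(2m, m)/2^(2m) for p = 1/2 and n = 2m; the function names are illustrative, and exact fractions are used so that equality is tested without rounding.

```python
from fractions import Fraction
from math import comb


def mean_deviation(n: int, p: Fraction) -> Fraction:
    """Sum over x of |x - n p| * C(n, x) p^x (1 - p)^(n - x), computed exactly."""
    return sum(abs(x - n * p) * comb(n, x) * p ** x * (1 - p) ** (n - x)
               for x in range(n + 1))


def closed_form(m: int) -> Fraction:
    """Assumed reading of (18.1): m * C(2m, m) / 2^(2m)."""
    return Fraction(m * comb(2 * m, m), 2 ** (2 * m))


if __name__ == "__main__":
    half = Fraction(1, 2)
    for m in range(1, 6):
        print(m, mean_deviation(2 * m, half), closed_form(m))
```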


Of course, this step was the easy part. De Moivre’s real problem was to obtain a 
fairly accurate approximation for (18.1) that would also be practical for a person 
such as Cuming to use. Here recall that we write $a_m \sim b_m$ as m → ∞ if $\lim_{m\to\infty}\frac{a_m}{b_m} = 1$. 
Briefly, de Moivre first found 

$$\frac{1}{2^{2m}}\binom{2m}{m} \sim \frac{2.168}{\sqrt{2m-1}}\left(1 - \frac{1}{2m}\right)^{2m} \quad \text{as } m \to \infty; \tag{18.2}$$

he then noted that 

$$\left(1 - \frac{1}{2m}\right)^{2m} \sim \frac{1}{e} \quad \text{as } m \to \infty$$

and obtained 

$$\frac{1}{2^{2m}}\binom{2m}{m} \sim \frac{2.168}{e}\cdot\frac{1}{\sqrt{2m-1}}. \tag{18.3}$$

Note that the value of $2.168\,e^{-1}$ is approximately 0.7976. To derive (18.3), he took 
the logarithm of the left-hand side of (18.2), expanded the logarithms as infinite series, 
and then changed the order of summation and applied Bernoulli’s formula for the sum 
of powers of consecutive integers. We present a fuller account later in this chapter. 
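
A quick numerical comparison shows how good de Moivre's estimates are. This sketch assumes the readings 2.168/√(2m−1) · (1 − 1/(2m))^(2m) for (18.2) and (2.168/e)/√(2m−1) for (18.3); the function names are hypothetical.

```python
import math


def central_term(m: int) -> float:
    """Exact value of C(2m, m) / 2^(2m)."""
    return math.comb(2 * m, m) / 4 ** m


def de_moivre_18_2(m: int) -> float:
    return 2.168 / math.sqrt(2 * m - 1) * (1 - 1 / (2 * m)) ** (2 * m)


def de_moivre_18_3(m: int) -> float:
    return (2.168 / math.e) / math.sqrt(2 * m - 1)


if __name__ == "__main__":
    for m in (5, 50, 500):
        exact = central_term(m)
        print(m, exact, de_moivre_18_2(m) / exact, de_moivre_18_3(m) / exact)
```

Both ratios approach a value very close to 1; the small residual gap reflects the fact that 2.168/e is only a rounded approximation to √(2/π).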

In June 1729, de Moivre received a letter from Stirling and published it in his 1730 
work, Miscellanea Analytica. Stirling wrote:³ 


About four years ago, when I informed Mr. Alex. Cuming that problems concerning the 
Interpolation and Summation of series and others of this type which are not susceptible to 
the commonly accepted analysis, can be solved by Newton’s Method of Differences, the most 
illustrious man replied that he doubted if the problem solved by you some years before about 
finding the middle coefficient in an arbitrary power of the binomial could be solved by differences. 
Then, led by curiosity and confident that I would be doing a favour to a most deserving man of 
Mathematics, I took it up willingly: and I admit that difficulties arose which prevented me from 
arriving at the desired conclusion rapidly, but I do not regret the labour, if I have in fact finally 
achieved a solution which is so acceptable to you that you consider it worthy of inclusion in your 
own writings. 


3 Stirling and Tweddle (2003) pp. 285-287, especially p. 285. See Tweedie (1922) p. 46 for the original Latin. 

Interestingly, Cuming was the intermediary in connecting Stirling and de Moivre 
and, in fact, Stirling was inducted into the Royal Society of London on the nomination 
of Cuming. Stirling’s letter gave the formulas,⁴ first for the square of the reciprocal of 
the left-hand side of equation (18.2): 

$$\left(\frac{1}{2^{2m}}\binom{2m}{m}\right)^{-2} = \pi m\left(1 + \frac{1}{2^2(m+1)} + \frac{(1\cdot 3)^2}{2^4\,2!\,(m+1)(m+2)} + \frac{(1\cdot 3\cdot 5)^2}{2^6\,3!\,(m+1)(m+2)(m+3)} + \cdots\right) \tag{18.4}$$

and then for the square of the left-hand side of (18.2) 

$$\left(\frac{1}{2^{2m}}\binom{2m}{m}\right)^{2} = \frac{2}{\pi(2m+1)}\left(1 + \frac{1}{2^2\left(m+\frac{3}{2}\right)} + \frac{(1\cdot 3)^2}{2^4\,2!\left(m+\frac{3}{2}\right)\left(m+\frac{5}{2}\right)} + \frac{(1\cdot 3\cdot 5)^2}{2^6\,3!\left(m+\frac{3}{2}\right)\left(m+\frac{5}{2}\right)\left(m+\frac{7}{2}\right)} + \cdots\right). \tag{18.5}$$


In his letter, Stirling did not give any indication of a proof except to say that he had 
established these formulas by the method of differences. The details are contained in 
his book,⁵ however, and we present them in the course of the present chapter. In fact, 
identities (18.4) and (18.5) can also be proved as particular cases of Gauss’s formula 
(17.14). However, it would be eighty years before Gauss would introduce this result. 
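
Because Stirling's two series converge, they can be summed directly and compared with the exact central binomial term. The sketch below assumes the readings of (18.4) and (18.5) in which the k-th term is (1·3⋯(2k−1))² / (2^(2k) k! c(c+1)⋯(c+k−1)), with c = m + 1 and prefactor πm for (18.4), and c = m + 3/2 and prefactor 2/(π(2m+1)) for (18.5). The helper names are made up for illustration.

```python
import math


def stirling_sum(c: float, terms: int = 2000) -> float:
    """Partial sum of sum_k (1*3*...*(2k-1))^2 / (2^(2k) k! c (c+1) ... (c+k-1))."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # ratio of consecutive terms: (2k+1)^2 / (4 (k+1) (c+k))
        term *= (2 * k + 1) ** 2 / (4 * (k + 1) * (c + k))
    return total


def series_18_4(m: int) -> float:
    return math.pi * m * stirling_sum(m + 1)


def series_18_5(m: int) -> float:
    return 2 / (math.pi * (2 * m + 1)) * stirling_sum(m + 1.5)


if __name__ == "__main__":
    m = 5
    central = math.comb(2 * m, m) / 4 ** m
    print(series_18_4(m), central ** -2)
    print(series_18_5(m), central ** 2)
```

The terms decay only like a power of k (roughly k^(−c)), so convergence is slow for small m and quick for larger m; the value m = 5 used here needs far fewer than the 2000 terms summed.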

Observe that by taking the first term of the series from (18.5), one gets a first 
approximation 

$$\frac{1}{2^{2m}}\binom{2m}{m} \sim \frac{1}{\sqrt{\pi m}} \quad \text{as } m \to \infty,$$

where the second and later terms can be neglected when m is large, leading to a more 
elegant approximation than de Moivre’s (18.3). When de Moivre received Stirling’s 
communication, he was astonished to see the appearance of π in this context. He 
searched the literature for a result that would explain this phenomenon and found 
that Wallis’s formula was just what he needed. In fact, Stirling had applied Wallis’s 
formula. 

Stirling presented his results with proofs, along with other results, in his 1730 
Methodus Differentialis. Formulas (18.4) and (18.5) appear as examples of propo- 
sitions 22 and 23. Proposition 28 discusses the problem of finding the sum of any 
number of logarithms, whose arguments are in arithmetic progression. As example 2 
of this result, Stirling gave the formula: for $s = m + \frac{1}{2}$, 

$$\ln m! = s\ln s + \frac{1}{2}\ln 2\pi - s - \frac{1}{24s} + \frac{7}{2880s^3} - \cdots, \tag{18.6}$$

whose statement contained a recursive procedure to find the coefficients of $\frac{1}{s^{2n-1}}$, for 
n = 1,2,3,.... After seeing this formula, de Moivre observed that the method that 
yielded him (18.3) would produce an infinite series for ln m!, a series he presented in 
the supplement to his Miscellanea Analytica:⁶ 

4 ibid. p. 286. 
5 Stirling (1730). 


$$\ln m! = \left(m + \frac{1}{2}\right)\ln m + \frac{1}{2}\ln 2\pi - m + \frac{1}{12m} - \frac{1}{360m^3} + \frac{1}{1260m^5} - \frac{1}{1680m^7} + \cdots. \tag{18.7}$$

The method employed by de Moivre would give the coefficient of $\frac{1}{m^{2k-1}}$ as 
$\frac{B_{2k}}{(2k-1)2k}$ though, following the custom of the seventeenth- and eighteenth-century 
mathematicians, he did not write the general term. 

Observe that the terms of the series in (18.7), after the third, appear to get smaller as 
m increases. However, by writing out many more terms of the series, we find that this is 
a misleading impression. In fact, the series (18.6) and (18.7) diverge for every m, 
but de Moivre was not aware of the rapid, faster than exponential, growth of the Bernoulli 
numbers. The first dozen values of the Bernoulli numbers might even suggest that 
they could be bounded. In his Supplement, de Moivre wrote that the coefficients of 
the terms after the fourth do not decrease and in the 1756 edition of his Doctrine 
of Chances, he remarked that the series converged, but slowly. Now in 1740, Euler 
proved a formula that we have given as (16.78); we can rewrite it as 

$$B_{2k} = \frac{(-1)^{k-1}(2k)!}{2^{2k-1}\pi^{2k}}\left(1 + \frac{1}{2^{2k}} + \frac{1}{3^{2k}} + \cdots\right).$$

Using (18.8), this implies 

$$|B_{2k}| \sim 4\sqrt{\pi k}\left(\frac{k}{\pi e}\right)^{2k} \quad \text{as } k \to \infty.$$

To find an approximate value of ln m!, de Moivre added the terms given in our 
(18.7). This gave a good approximation because the error term had the same order 
as the last term employed, that is, $\frac{1}{m^7}$. The eighteenth-century mathematicians were 
intuitively aware of this method of summing divergent series of this type, so that they 
summed the terms of the series as long as the terms decreased, and stopped summing 
when the terms began to increase. Such divergent series are known as asymptotic 
series; although such series diverge, they yield very good approximations of the functions 
they represent, as long as a suitable number of terms of the series is utilized. 
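
The rule of summing while the terms decrease can be watched in action. The following sketch is an illustration under stated assumptions: it takes the series for ln m! to be (m + ½) ln m + ½ ln 2π − m + Σ B_2k / ((2k−1)2k m^(2k−1)), generates Bernoulli numbers from the standard recurrence, and compares partial sums with `math.lgamma(m + 1)`. All function names are invented for the example.

```python
import math
from fractions import Fraction


def bernoulli_numbers(n_max: int) -> list[Fraction]:
    """B_0..B_n_max from the recurrence sum_{j<=n} C(n+1, j) B_j = 0 (so B_1 = -1/2)."""
    b = [Fraction(1)]
    for n in range(1, n_max + 1):
        b.append(-sum(math.comb(n + 1, j) * b[j] for j in range(n)) / (n + 1))
    return b


def series_terms(m: int, count: int) -> list[float]:
    """Terms B_2k / ((2k-1) 2k m^(2k-1)) for k = 1..count."""
    b = bernoulli_numbers(2 * count)
    return [float(b[2 * k] / ((2 * k - 1) * (2 * k))) / m ** (2 * k - 1)
            for k in range(1, count + 1)]


def errors_after_k_terms(m: int, count: int) -> list[float]:
    """|ln m! - partial sum| after 1, 2, ..., count correction terms."""
    partial = (m + 0.5) * math.log(m) + 0.5 * math.log(2 * math.pi) - m
    exact = math.lgamma(m + 1)
    errors = []
    for term in series_terms(m, count):
        partial += term
        errors.append(abs(exact - partial))
    return errors


if __name__ == "__main__":
    for k, (t, e) in enumerate(zip(series_terms(2, 12), errors_after_k_terms(2, 12)), 1):
        print(k, abs(t), e)
```

For m = 2 the terms shrink until the seventh and then start to grow, and the error after k terms stays below the size of term k + 1, so stopping near the smallest term gives the best attainable accuracy.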


6 de Moivre (1730b). 

On this basis, we can see that the first three terms given in (18.7), expressed as 

$$\left(m + \frac{1}{2}\right)\ln m + \frac{1}{2}\ln 2\pi - m = \ln\left(\sqrt{2\pi m}\,m^m e^{-m}\right),$$

approximate ln m! to the order of $\frac{1}{m}$ when m is large. Thus, we have the approximation 

$$m! \sim \sqrt{2\pi m}\,m^m e^{-m}, \quad m \to \infty. \tag{18.8}$$


Though (18.8) actually follows immediately from de Moivre’s (18.7), the first 
person to point out (18.8) was Euler in a letter to Goldbach dated July 4, 1744.⁷ The 
result (18.8) is now called Stirling’s approximation. By using Stirling’s series (18.6), 
we obtain the approximation 

$$m! \sim \sqrt{2\pi}\left(\frac{m + \frac{1}{2}}{e}\right)^{m+\frac{1}{2}}. \tag{18.9}$$
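
The two approximations can be compared side by side. The sketch assumes (18.8) is √(2πm) m^m e^(−m) and (18.9) is √(2π) ((m + ½)/e)^(m+½); the function names are illustrative.

```python
import math


def approx_18_8(m: int) -> float:
    return math.sqrt(2 * math.pi * m) * (m / math.e) ** m


def approx_18_9(m: int) -> float:
    return math.sqrt(2 * math.pi) * ((m + 0.5) / math.e) ** (m + 0.5)


if __name__ == "__main__":
    for m in (1, 5, 10, 20):
        exact = math.factorial(m)
        print(m, approx_18_8(m) / exact - 1, approx_18_9(m) / exact - 1)
```

The first neglected term is 1/(12m) in (18.7) and −1/(24s) in (18.6), so (18.8) falls slightly short of m! while (18.9) overshoots by about half as much.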

In the Miscellanea Analytica of 1730, de Moivre first derived an asymptotic series 
for the logarithm of the left-hand side of (18.2), that in modern notation can be 
rendered as 

$$\ln\left(\frac{1}{2^{2m}}\binom{2m}{m}\right) = \left(2m - \frac{1}{2}\right)\ln(2m-1) - 2m\ln(2m) + \ln 2 - \frac{1}{2}\ln(2\pi) + 1 - \sum_{k=1}^{\infty}\frac{B_{2k}}{(2k-1)2k}\left(\frac{2}{m^{2k-1}} - \frac{1}{(2m-1)^{2k-1}}\right); \tag{18.10}$$

note the similarity with (18.7). In fact, de Moivre proved both (18.7) and (18.10) 
by applying three formulas: the Mercator-Newton power series for ln(1 + x); Jakob 
Bernoulli’s formula for $\sum_{k=1}^{n} k^m$; and Wallis’s formula for π. We give the details 
later in the chapter. Observe that Stirling’s series (18.4) and (18.5) for the square of 
$\frac{1}{2^{2m}}\binom{2m}{m}$ and its reciprocal are convergent series. By contrast, de Moivre’s series for 
the logarithm of $\frac{1}{2^{2m}}\binom{2m}{m}$ is an asymptotic series, a series that diverges but whose 
first few terms give a very good approximation. 
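
The asymptotic series for the logarithm of the central term can be checked against an exact computation. The sketch below assumes the following reading of (18.10), obtained by applying (18.7) to (2m − 1)! and to m!: the head is (2m − ½) ln(2m − 1) − 2m ln(2m) + ln 2 − ½ ln 2π + 1, and the k-th correction is −B_2k/((2k−1)2k) · (2/m^(2k−1) − 1/(2m−1)^(2k−1)). Only the first three coefficients 1/12, −1/360, 1/1260 are used, and the names are hypothetical.

```python
import math

# B_2k / ((2k-1) 2k) for k = 1, 2, 3
COEFFICIENTS = [1 / 12, -1 / 360, 1 / 1260]


def log_central_exact(m: int) -> float:
    """ln( C(2m, m) / 2^(2m) ) computed directly."""
    return math.log(math.comb(2 * m, m)) - 2 * m * math.log(2)


def log_central_series(m: int, terms: int = 3) -> float:
    """Head of the asymptotic series plus the first `terms` corrections."""
    head = ((2 * m - 0.5) * math.log(2 * m - 1) - 2 * m * math.log(2 * m)
            + math.log(2) - 0.5 * math.log(2 * math.pi) + 1)
    tail = sum(c * (2 / m ** (2 * k - 1) - 1 / (2 * m - 1) ** (2 * k - 1))
               for k, c in enumerate(COEFFICIENTS[:terms], start=1))
    return head - tail


if __name__ == "__main__":
    for m in (1, 2, 5, 10):
        print(m, log_central_exact(m), log_central_series(m))
```

The constant ln 2 − ½ ln 2π + 1 ≈ 0.7742 is the logarithm of about 2.169, which matches de Moivre's numerical constant 2.168 in (18.2).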

Soon after the work of de Moivre and Stirling, Euler and Maclaurin independently 
discovered a very general formula from which (18.6) and (18.7) could be easily 
derived. In his paper on the gamma function,⁸ Gauss referred to Euler’s derivation. 
He noted that though Euler stated the result for ln Γ(x) when x was a positive integer, 
Euler’s method applied to the general case so that for any real x > 0 

$$\ln\Gamma(x+1) = \left(x + \frac{1}{2}\right)\ln x - x + \frac{1}{2}\ln 2\pi + \frac{B_2}{1\cdot 2x} + \frac{B_4}{3\cdot 4x^3} + \frac{B_6}{5\cdot 6x^5} + \text{etc.} \tag{18.11}$$

7 Fuss (1968) vol. 1, p. 283. 
8 Gauss (1813) art. 29. 


Note that in his 1813 paper, Gauss denoted the absolute values of the Bernoulli 
numbers by A, B, C, D,... and used his own notation, Π(x), to mean Γ(x + 1). He 
also stated that it was important to understand why the first few terms of a divergent 
series might yield an excellent approximation. He also pointed out that the series 
(18.6) and (18.7) of Stirling and de Moivre respectively, when put in their more general 
form for ln Γ(x + 1), could be obtained from each other by the duplication formula 
for the gamma function. Note that the case n = 2 of (17.15) gives the duplication 
formula. 

In 1843,⁹ Cauchy gave an explanation of the peculiar nature of the series (18.11). 
He proved that 

$$\mu(x) = \ln\Gamma(x) - \left(x - \frac{1}{2}\right)\ln x + x - \frac{1}{2}\ln 2\pi = \sum_{k=1}^{m}\frac{B_{2k}}{(2k-1)2k\,x^{2k-1}} + \theta\,\frac{B_{2m+2}}{(2m+1)(2m+2)\,x^{2m+1}}, \quad \text{where } 0 < \theta < 1. \tag{18.12}$$

This implies that when m terms of the series (18.11) are used, the error is less than 
the (m + 1)th term and has the same sign as that term, determined by the sign of 
$B_{2m+2}$. Thus, the eighteenth-century mathematicians had the judgment to choose an 
ideal stopping point in the series for their numerical calculations. Although Poisson 
and Jacobi had already proved Cauchy’s result in a more general situation, they did not 
specifically note this important particular case. It is also possible that Cauchy wished 
to show how an integral representation for μ(x) in (18.12), due to his friend Binet, 
could be used for the proof of (18.12). 
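
Cauchy's statement says that the remainder after m terms is a fraction θ, strictly between 0 and 1, of the next term. The sketch below estimates θ numerically, assuming μ(x) = ln Γ(x) − (x − ½) ln x + x − ½ ln 2π and terms B_2k/((2k−1)2k x^(2k−1)); `math.lgamma` supplies ln Γ, and the function names are made up for the example.

```python
import math
from fractions import Fraction


def bernoulli_numbers(n_max: int) -> list[Fraction]:
    """B_0..B_n_max from the recurrence sum_{j<=n} C(n+1, j) B_j = 0."""
    b = [Fraction(1)]
    for n in range(1, n_max + 1):
        b.append(-sum(math.comb(n + 1, j) * b[j] for j in range(n)) / (n + 1))
    return b


def mu(x: float) -> float:
    return math.lgamma(x) - (x - 0.5) * math.log(x) + x - 0.5 * math.log(2 * math.pi)


def theta(x: float, m: int) -> float:
    """Remainder after m terms divided by the (m+1)th term of the series."""
    b = bernoulli_numbers(2 * m + 2)

    def term(k: int) -> float:
        return float(b[2 * k]) / ((2 * k - 1) * (2 * k) * x ** (2 * k - 1))

    partial = sum(term(k) for k in range(1, m + 1))
    return (mu(x) - partial) / term(m + 1)


if __name__ == "__main__":
    for m in range(1, 6):
        print(m, theta(3.0, m))
```

A positive θ means the remainder has the sign of the next term, and θ < 1 means it is smaller in size; double-precision rounding limits how far m can be pushed before the tiny remainder is lost in noise.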

Jacques Binet (1786-1856) studied at the Ecole Polytechnique from 1804 to 1806 
and returned to the institution as an instructor in 1807. His main interests were 
astronomy and optics, though he contributed some important papers in mathematics. 
He was a good friend of Cauchy, and in 1812 the two generalized some results on 
determinants and took the subject to a higher level of generality. In particular, Binet 
stated the multiplication theorem in more general terms, so that his work can be taken 
as an early discussion of the product of two rectangular matrices. In an 1839 paper of 
over 200 pages,¹⁰ Binet gave two integral representations for μ(x) in (18.12). These 
are now called Binet’s formulas. In applying integrals to study the gamma function, 
Binet was following the trend of the 1830s. Thus, he used Euler’s formula for the beta 
integral $\int_0^1 x^{m-1}(1-x)^{n-1}\,dx$ to prove Stirling’s formulas (18.4) and (18.5). 

Although the asymptotic series (18.11) and similar series were used frequently after 
1850, it was in 1886 that Henri Poincaré gave a formal definition. He noted that it was 
well known to geometers that if $S_n$ denoted the terms of the series for ln Γ(x + 1) up to 
and including (say) the term in $\frac{1}{x^n}$, then the expression $x^{n+1}(\ln\Gamma(x+1) - S_n)$ tended to 0 
when x increased indefinitely. He then defined an asymptotic series: 

I say that a divergent series 

$$A_0 + \frac{A_1}{x} + \frac{A_2}{x^2} + \cdots + \frac{A_n}{x^n} + \cdots,$$

where the sum of the first n + 1 terms is $S_n$, asymptotically represents a function J(x) if the 
expression $x^n(J - S_n)$ tends to 0 when x increases indefinitely. 

9 Cauchy (1843b). 
10 Binet (1839). 


He showed that asymptotic series behaved well under the algebraic operations 
of addition, subtraction, multiplication, and division. Term-by-term integration of 
an asymptotic series also worked, but not differentiation. Poincaré noted that the 
theory remained unchanged if one supposed that x tended to infinity radially (in 
the complex plane) with a fixed nonzero argument. However, a divergent series 
could not represent one and the same function J in all directions of radial approach 
to infinity. He also observed that the same series could represent more than one 
function asymptotically. Poincaré applied his theory of asymptotic series to the 
solution of differential equations, though the British mathematician George Stokes 
developed some of these ideas earlier, in the 1850s and 1860s, in connection with 
Bessel’s equation. 

The Dutch mathematician Thomas Joannes Stieltjes (1856-1894) also developed 
a theory of asymptotic series; following Legendre, he called them semiconvergent 
series. Stieltjes’s paper¹¹ appeared in the same year as Poincaré’s, 1886. Then in 1889, 
Stieltjes extended formula (18.12) to the slit complex plane $\mathbb{C}^- = \mathbb{C}\setminus(-\infty, 0]$. Until 
then, the formula was known to hold only in the right half-plane. He accomplished 
this extension by a systematic use of the formula¹² 

$$\mu(z) = \int_0^\infty\frac{\lfloor t\rfloor - t + \frac{1}{2}}{z+t}\,dt, \tag{18.13}$$

where μ was defined by (18.12). 


18.2 De Moivre’s Asymptotic Series 


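De Moivre’s derivation of (18.7) in the Supplement¹³ started with 

$$\ln\frac{m^{m-1}}{(m-1)!} = \ln\prod_{k=1}^{m-1}\left(1 - \frac{k}{m}\right)^{-1} = -\sum_{k=1}^{m-1}\ln\left(1 - \frac{k}{m}\right) = \sum_{k=1}^{m-1}\sum_{n=1}^{\infty}\frac{k^n}{n\,m^n} = \sum_{n=1}^{\infty}\frac{1}{n\,m^n}\sum_{k=1}^{m-1}k^n. \tag{18.14}$$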


11 Stieltjes (1886a). 
12 Stieltjes (1889). 
13 de Moivre (1730b). 

We remark that de Moivre did not use the summation or factorial notation. He 
effected the change in the order of summation by writing the series for $\ln\left(1 - \frac{k}{m}\right)$ in 
rows, for some values of k, and then summing the columns. He then reproduced Jakob 
Bernoulli’s table for the sums of powers of integers and applied it to the inner sum in 
the last expression of (18.14). In modern notation, Bernoulli’s formula, also given as 
(2.26), may be written as 

$$\sum_{k=1}^{m-1}k^n = \frac{(m-1)^{n+1}}{n+1} + \frac{1}{2}(m-1)^n + \frac{B_2}{2}\binom{n}{1}(m-1)^{n-1} + \frac{B_4}{4}\binom{n}{3}(m-1)^{n-3} + \cdots. \tag{18.15}$$
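
Bernoulli's formula for sums of powers can be verified exactly for small cases. The sketch assumes the formula is read as N^(n+1)/(n+1) + N^n/2 + Σ_j (B_2j/(2j)) C(n, 2j−1) N^(n−2j+1), with N = m − 1 and the sum running over j ≥ 1 with 2j ≤ n; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max: int) -> list[Fraction]:
    """B_0..B_n_max from the recurrence sum_{j<=n} C(n+1, j) B_j = 0."""
    b = [Fraction(1)]
    for n in range(1, n_max + 1):
        b.append(-sum(comb(n + 1, j) * b[j] for j in range(n)) / (n + 1))
    return b


def power_sum_formula(n: int, upper: int) -> Fraction:
    """Bernoulli's closed form for 1^n + 2^n + ... + upper^n."""
    b = bernoulli_numbers(n)
    total = Fraction(upper ** (n + 1), n + 1) + Fraction(upper ** n, 2)
    j = 1
    while 2 * j <= n:
        total += b[2 * j] / (2 * j) * comb(n, 2 * j - 1) * upper ** (n - 2 * j + 1)
        j += 1
    return total


if __name__ == "__main__":
    m = 7
    for n in range(1, 7):
        print(n, power_sum_formula(n, m - 1), sum(k ** n for k in range(1, m)))
```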
Thus, de Moivre had the equation 
m1 f (m1)? m= igh 
GES = aa ae PPG con®) 
1 (m-1)  @-1P  m-1 
2mm 3 2 6 
1 ((m-1)* (m—-1)3 3B, (m — 1)” 
3m3 4 o 2 
i Ly = yt =i}? —1 
Gn 1)? Gnd) dns ) ig ) Sais. 
4m4 5 2 2 4 


He then set x = mat and changed the order of summation to get 


1 mn! x2 x4 1 x2 x3 
i te ahs Sue wc es ead PO eee eee 
Geet Ne Gs Dna ae 


_ Bo (ogi spy 3% _ Ba ba in eee | 
t= (x42 pr xX re) TS (x4 Se ae a 1 es Is ha (18.17) 


The general term, left unexpressed by de Moivre, as was the common practice, 
would have been 


Bo, ar\ x (2r+l\ xt, (2r+2 ie eae 
2rm2r—! 1} 2r 2 2r+1 3 2r+2 


a Bo, 2r —1 2r\ 9 2r+1\ 3 
~ 2rQr — Dm (( 1 J+ (Z)e+/ 3 )s a 


Ba —2r+1 
= 2r(2r — Deal (1 5 ea 1) 


B>, 1 
eae: earn (18.18) 
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The second series in (18.17) summed to 


= 
ind —x)= in( it ) =i, 
m 


and the first series turned out to be the integral of this series, equal to 


a ee 
(=n nea pees 


m 


This computation involves integration by parts, and it is interesting to see how de 
Moivre handled it. He used Newton’s notation for fluxions, as was only natural since 
de Moivre worked in England and was Newton’s friend. He set 


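\[
v = \ln\frac{1}{1-x} = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \cdots.
\]
Then
\[
v\dot{x} = x\dot{x} + \frac12x^2\dot{x} + \frac13x^3\dot{x} + \frac14x^4\dot{x} + \cdots,
\]
\[
F.v\dot{x} = \frac12x^2 + \frac16x^3 + \frac1{12}x^4 + \frac1{20}x^5 + \cdots \quad (F = \text{fluent} = \text{integral}).
\]
He then set
\[
q = vx - F.v\dot{x} = x^2 + \frac12x^3 + \frac13x^4 + \cdots - \left(\frac12x^2 + \frac16x^3 + \frac1{12}x^4 + \cdots\right)
\]
so that
\[
\dot{q} = 2x\dot{x} + \frac32x^2\dot{x} + \frac43x^3\dot{x} + \cdots - x\dot{x} - \frac12x^2\dot{x} - \frac13x^3\dot{x} - \cdots
= \frac{x\dot{x}}{1-x} = -\dot{x} + \dot{v}.
\]
Therefore q = −x + v and F.vẋ = vx − q = vx − v + x. Using the above
simplifications, (18.17) became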


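\[
\begin{aligned}
\ln\frac{m^{m-1}}{(m-1)!} &= (m-1) - \ln m + \frac12\ln m
+ \frac{B_2}{1\cdot 2}\left(1-\frac1m\right) + \frac{B_4}{3\cdot 4}\left(1-\frac1{m^3}\right) + \frac{B_6}{5\cdot 6}\left(1-\frac1{m^5}\right) + \cdots\\
&= (m-1) - \frac12\ln m + \frac1{12} - \frac1{360} + \frac1{1260} - \frac1{1680} + \cdots\\
&\qquad - \frac{1}{12m} + \frac{1}{360m^3} - \frac{1}{1260m^5} + \frac{1}{1680m^7} - \cdots.
\end{aligned}
\]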
Note that we have substituted the numerical values of Bernoulli numbers:

\[
B_2 = \frac16,\quad B_4 = -\frac1{30},\quad B_6 = \frac1{42},\quad B_8 = -\frac1{30},\ \ldots;
\]
de Moivre worked only with these numerical values in his calculations. After adding
ln m to each side and rearranging terms, de Moivre had


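\[
\begin{aligned}
\ln m! = {}&\left(m+\frac12\right)\ln m - m + 1 - \frac1{12} + \frac1{360} - \frac1{1260} + \frac1{1680} - \cdots\\
&+ \frac{1}{12m} - \frac{1}{360m^3} + \frac{1}{1260m^5} - \frac{1}{1680m^7} + \cdots.
\end{aligned}\tag{18.19}
\]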


De Moivre remarked that the constant in this equation could be quickly computed 
by taking m = 2. In that case, 


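\[
\begin{aligned}
C &= 1 - \frac1{12} + \frac1{360} - \frac1{1260} + \frac1{1680} - \cdots\\
&= 2 - \frac32\ln 2 - \frac{1}{12\times 2} + \frac{1}{360\times 8} - \frac{1}{1260\times 32} + \frac{1}{1680\times 128} - \cdots.
\end{aligned}\tag{18.20}
\]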


As we have noted before, the two series here are divergent but the terms as written
down by de Moivre gave a good approximation for C. After learning of Stirling’s
result on the asymptotic value of $\binom{2m}{m}$, de Moivre realized that C = ½ ln(2π),
and he proved it using Wallis’s formula. Stirling and de Moivre’s derivations for the
value of C were identical; we present the details in the next section.
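
De Moivre's remark that the constant can be computed quickly from m = 2 can be checked numerically. The following is a minimal sketch, assuming the form of (18.19) with four terms of the 1/m series; the function name `de_moivre_constant_from_m` is ours, and `math.lgamma` is used only to supply ln m!.

```python
import math

# Coefficients B_{2r} / ((2r - 1) * 2r) for r = 1..4, as they appear in (18.19).
COEFFS = [1 / 12, -1 / 360, 1 / 1260, -1 / 1680]


def de_moivre_constant_from_m(m: int) -> float:
    """Solve (18.19) for the constant C, keeping four terms of the 1/m series."""
    tail = sum(c / m ** (2 * r + 1) for r, c in enumerate(COEFFS))
    return math.lgamma(m + 1) - (m + 0.5) * math.log(m) + m - tail


c_from_2 = de_moivre_constant_from_m(2)   # de Moivre's choice m = 2
c_exact = 0.5 * math.log(2 * math.pi)     # the value identified via Stirling
print(c_from_2, c_exact)
```

Even with m as small as 2, the truncated series agrees with ½ ln(2π) to roughly six decimal places, which illustrates why the divergent series was still useful in practice.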


18.3 Stirling’s Asymptotic Series 


Stirling’s Methodus Differentialis gave several ingenious applications of difference 
equations. His derivations of (18.4) and (18.5) were probably his most imaginative 
use of difference equations. It is obvious from his letter to de Moivre that he was 
quite proud of his solutions. We give details of this work and derive equation (18.6). 
Proposition 23 of Stirling’s book states the problem: to find the ratio of the middle 
coefficient to the sum of all coefficients in any power of the binomial. Stirling observed 


that the sequence $1, 2, \frac{8}{3}, \frac{16}{5}, \frac{128}{35}, \ldots, 2^{2m}\big/\binom{2m}{m}, \ldots$, m = 0, 1, 2, …, satisfied the
relation $T' = \frac{n+2}{n+1}T$, where n = 2m = 0, 2, 4, … and T′ denoted the term after
T. So if T was the nth term, T′ would be obtained by changing n to n + 2 in T. In
modern notation,

\[
T_{n+2} = \frac{n+2}{n+1}\,T_n.
\]

After squaring the relation, Stirling rewrote it to get

\[
2T'^2 + (n+2)(T^2 - T'^2) - \frac{T'^2}{n+2} = 0, \tag{18.21}
\]


the difference equation into which Stirling substituted an inverse factorial series to 
solve his problem. Since he had so many difference equations from which to choose, 
it is hard to discern how Stirling was guided to this one; it worked very successfully. 
Stirling first took 


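\[
\begin{aligned}
T^2 &= An + \frac{Bn}{n+2} + \frac{Cn}{(n+2)(n+4)} + \frac{Dn}{(n+2)(n+4)(n+6)} + \cdots\\
&= An + B + \frac{C-2B}{n+2} + \frac{D-4C}{(n+2)(n+4)} + \cdots.
\end{aligned}\tag{18.22}
\]
Then
\[
T'^2 = A(n+2) + B + \frac{C-2B}{n+4} + \frac{D-4C}{(n+4)(n+6)} + \cdots, \tag{18.23}
\]
so that
\[
(n+2)(T^2 - T'^2) = -2A(n+2) + \frac{2C-4B}{n+4} + \frac{4D-16C}{(n+4)(n+6)} + \cdots. \tag{18.24}
\]

It followed from (18.22) after replacing T by T′ and n by n + 2 that

\[
\frac{T'^2}{n+2} = A + \frac{B}{n+4} + \frac{C}{(n+4)(n+6)} + \frac{D}{(n+4)(n+6)(n+8)} + \cdots.
\]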


He used the three series (18.22), (18.23), and (18.24) in (18.21), and the result was 


\[
2B - A + \frac{4C-9B}{n+4} + \frac{6D-25C}{(n+4)(n+6)} + \frac{8E-49D}{(n+4)(n+6)(n+8)} + \cdots = 0. \tag{18.25}
\]


Since the constant term and the numerators of the other terms must be zero in (18.25), 
it followed that 


\[
2B - A = 0,\quad 4C - 9B = 0,\quad 6D - 25C = 0,\quad 8E - 49D = 0,\ \ldots,
\]


so that, clearly, Stirling had the series in (18.4) except for the value of A. As 
was the practice among the eighteenth-century mathematicians, Stirling computed 
only the first few coefficients B, C, D,... and gave no expression for the general 
term. To find A, he argued that by (18.22) for large n, T² ≈ An. Stirling then
made an application of Wallis’s formula, to which he referred in his exposition. For
n = 2m,
\[
\frac{T^2}{n} = \frac{1}{2m}\left(\frac{2^{2m}}{\binom{2m}{m}}\right)^2
= \frac{2^2\times 4^2\times\cdots\times(2m)^2}{3^2\times 5^2\times\cdots\times(2m-1)^2\times(2m+1)}\cdot\frac{2m+1}{2m}
\to \frac{\pi}{2} \quad\text{as } n\to\infty.
\]
Hence A = π/2.
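
As a numerical illustration of how the relations 2B − A = 0, 4C − 9B = 0, … determine the series once A = π/2 is known, here is a short sketch. It assumes the inverse factorial form of (18.22) and the pattern 2j·(next coefficient) = (2j − 1)²·(previous coefficient) suggested by the first four relations; the function names are ours.

```python
import math
from fractions import Fraction


def exact_T_squared(m: int) -> Fraction:
    """T^2 for n = 2m, where T = 2^(2m) / binom(2m, m)."""
    return Fraction(4 ** m, math.comb(2 * m, m)) ** 2


def stirling_series_T_squared(m: int, terms: int = 5) -> float:
    """Partial sum An + Bn/(n+2) + Cn/((n+2)(n+4)) + ... with `terms` terms."""
    n = 2 * m
    coeff = math.pi / 2          # A, fixed by Wallis's formula
    total = coeff * n
    denom = 1.0
    for j in range(1, terms):
        # 2B = A, 4C = 9B, 6D = 25C, ...: 2j * next = (2j - 1)^2 * previous
        coeff *= (2 * j - 1) ** 2 / (2 * j)
        denom *= n + 2 * j
        total += coeff * n / denom
    return total


exact = float(exact_T_squared(50))
approx = stirling_series_T_squared(50)
print(exact, approx)
```

For m = 50 the five-term partial sum already matches the exact value to better than one part in a million.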


First in his Miscellanea and then again in the 1756 edition of his Doctrine of
Chances, de Moivre praised Stirling for his introduction of π in the asymptotic series
for ln n!. In the latter work he wrote, “I own with pleasure that this discovery, besides
that it saved trouble, has spread a singular Elegancy on the Solution.”¹⁴

Stirling found the series in (18.5) in a similar manner. This time he let $T_{n+2} = \frac{n+1}{n+2}T_n$;
after squaring this relation he could thus write

\[
(n+1)(n+3)(T_n^2 - T_{n+2}^2) - 2(n+1)T_n^2 - T_{n+2}^2 = 0.
\]

Stirling next let

\[
T_n^2 = \frac{A}{n+1} + \frac{B}{(n+1)(n+3)} + \frac{C}{(n+1)(n+3)(n+5)} + \frac{D}{(n+1)(n+3)(n+5)(n+7)} + \cdots \tag{18.26}
\]

and applied a procedure similar to that by which he solved (18.21) to get, as before:
\[
2B - A = 0,\quad 4C - 9B = 0,\quad 6D - 25C = 0,\quad 8E - 49D = 0,\ \ldots,
\]


relations that gave him (18.5). 

Now series (18.6) for In m! was a corollary of Stirling’s main result, contained 
in proposition 28 of his book. The purpose of the proposition was to find the sum 
of any number of logarithms, whose arguments were in arithmetic progression. He 
denoted the progression by x + n, x + 3n, x + 5n, …, z − n, where the logarithms
were taken base 10. Since $\log_{10} x = \frac{\ln x}{\ln 10}$, he defined the number $a = \frac{1}{\ln 10}$ and gave
the approximate value of a to be 0.43429,44819,03252. To state his result, Stirling
began with the series 


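\[
f(z) = \frac{z}{2n}\log_{10}z - \frac{az}{2n} + aA_1\frac{n}{z} + aA_2\frac{n^3}{z^3} + aA_3\frac{n^5}{z^5} + aA_4\frac{n^7}{z^7} + \cdots, \tag{18.27}
\]
where the numbers A₁, A₂, A₃, … were such that
\[
\sum_{k=1}^{m}\binom{2m-1}{2k-2}A_k = -\frac{1}{4m(2m+1)}. \tag{18.28}
\]
He had the values
\[
A_1 = -\frac{1}{12},\quad A_2 = \frac{7}{360},\quad A_3 = -\frac{31}{1260},\quad A_4 = \frac{127}{1680},\quad A_5 = -\frac{511}{1188}.
\]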


14 de Moivre (1967) p. 244. 
In fact, one can show that

\[
A_k = -\frac{(2^{2k-1}-1)B_{2k}}{2k(2k-1)}. \tag{18.29}
\]


Stirling may not have recognized this connection with Bernoulli numbers when he 
discovered his result on In m!. Later, after reading de Moivre’s book, where Bernoulli 
numbers were explicitly mentioned, Stirling investigated properties of these numbers 
and discussed them in some unpublished notes. See equation (10.77) for a relation 
between Bernoulli and Stirling numbers. 

Stirling’s main theorem was that 


\[
\log_{10}\bigl((x+n)(x+3n)(x+5n)\cdots(z-n)\bigr) = f(z) - f(x), \tag{18.30}
\]


where f(z) was the series defined in (18.27). His proof consisted in observing that 


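\[
\begin{aligned}
f(z) - f(z-2n) &= \log_{10}z - a\left(\frac{n}{z} + \frac12\left(\frac{n}{z}\right)^2 + \frac13\left(\frac{n}{z}\right)^3 + \cdots\right)\\
&= \log_{10}z + \log_{10}\left(1-\frac{n}{z}\right) = \log_{10}(z-n).
\end{aligned}\tag{18.31}
\]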


Stirling apparently left the verification of this equation to the reader. He made the 
remark that the terms in f(z) and f(z — 2n) had first to be reduced to the same form. 
The theorem follows immediately from (18.31): 


\[
\begin{aligned}
f(z) - f(x) &= \bigl(f(z) - f(z-2n)\bigr) + \bigl(f(z-2n) - f(z-4n)\bigr) + \cdots + \bigl(f(x+2n) - f(x)\bigr)\\
&= \log_{10}(z-n) + \log_{10}(z-3n) + \cdots + \log_{10}(x+n).
\end{aligned}
\]


Stirling applied his theorem to derive his series for $\log_{10} m!$ by taking $x = \frac12$, $n = \frac12$,
and $z = m + \frac12$. From this he had

\[
f\left(m+\tfrac12\right) = \left(m+\tfrac12\right)\log_{10}\left(m+\tfrac12\right) - a\left(m+\tfrac12\right)
- \frac{a}{24\left(m+\frac12\right)} + \frac{7a}{2880\left(m+\frac12\right)^3} - \text{etc.}
\]

Next, by (18.30), $\log_{10} m! = f\left(m+\frac12\right) - f\left(\frac12\right)$. Stirling wrote that
\[
-f\left(\tfrac12\right) = \tfrac12\log_{10}2\pi \approx 0.39908,99341,79,
\]
but he did not explain how he arrived at $\frac12\log_{10}2\pi$. Perhaps he numerically computed
$\log_{10}m! - f\left(m+\frac12\right)$ for a large enough value of m and noticed that he had half
the value of $\log_{10}2\pi$; he must have been very familiar with this value, based on his
extensive numerical work. Recall that he had recognized $\sqrt{\pi}$ from its numerical
value when computing $\left(\frac12\right)!$. Of course, he could also have provided a proof using
Wallis’s formula, as he did in the situation discussed above.
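
Stirling's series for log₁₀ m! can be tried out directly. The sketch below assumes the form of (18.27) with x = n = ½ and z = m + ½, truncated after A₄, and adds the constant ½ log₁₀ 2π; the helper names `f` and `log10_factorial` are ours.

```python
import math

A_COEFFS = [-1 / 12, 7 / 360, -31 / 1260, 127 / 1680]  # Stirling's A_1, ..., A_4
a = 1 / math.log(10)                                    # Stirling's a, about 0.4342944819


def f(z: float, n: float) -> float:
    """Series (18.27), truncated after the A_4 term."""
    total = z / (2 * n) * math.log10(z) - a * z / (2 * n)
    for k, coeff in enumerate(A_COEFFS, start=1):
        total += a * coeff * (n / z) ** (2 * k - 1)
    return total


def log10_factorial(m: int) -> float:
    """log10 m! from f(m + 1/2) plus the constant (1/2) log10(2 pi)."""
    return f(m + 0.5, 0.5) + 0.5 * math.log10(2 * math.pi)


print(log10_factorial(10), math.log10(math.factorial(10)))
```

Already for m = 10 the truncated series agrees with the directly computed logarithm to about ten decimal places.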
We can prove that the sequence (18.29) satisfies the recurrence relation (18.28) by
first setting $C_k$ as the coefficients in the Taylor series expansion of csc x:

\[
\frac{\csc x}{x} = \frac{1}{x^2} + \sum_{k=1}^{\infty}(-1)^k\frac{2C_k}{(2k-2)!}x^{2k-2}, \quad 0 < x < \pi.
\]

From (16.81) we know that

\[
C_k = -\frac{(2^{2k-1}-1)B_{2k}}{2k(2k-1)}.
\]

Now by considering the Taylor series

\[
\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots,
\]
we see that
\[
\left(\sum_{k=1}^{\infty}(-1)^{k-1}\frac{x^{2k-1}}{(2k-1)!}\right)\left(\frac{1}{x^2} + \sum_{k=1}^{\infty}(-1)^k\frac{2C_k}{(2k-2)!}x^{2k-2}\right) = \frac{1}{x}. \tag{18.32}
\]
Equating the coefficient of $x^{2m-1}$, m = 1, 2, 3, …, in (18.32) produces
\[
\frac{1}{(2m+1)!} + \frac{2C_m}{1!\,(2m-2)!} + \frac{2C_{m-1}}{3!\,(2m-4)!} + \cdots + \frac{2C_1}{(2m-1)!\,0!} = 0;
\]
multiply by (2m − 1)!/2 to arrive at
\[
\sum_{k=1}^{m}\binom{2m-1}{2k-2}C_k = -\frac{1}{4m(2m+1)}. \tag{18.33}
\]
When we compare (18.33) with (18.28), it is clear that

\[
A_k = C_k = -\frac{(2^{2k-1}-1)B_{2k}}{2k(2k-1)}.
\]


By using Lagrange’s symbolic method, we obtain an alternative perspective on $A_k$
as expressed in (18.29). To see this, we invoke Lagrange’s formulas from Chapter 21.

Taking D to be d/dx,
\[
e^{D}f(x) = f(x+1),
\]
ERA OHSIGED, 
1 1 Dk 
wa 5('-3 ea Saar: 


k=1 
Now Stirling required the sum f(x + 1) + f(x + 3) + ··· + f(x + 2n − 1) + ···.
Let S(x + 1) denote this sum. Then 


\[
(e^{D} - e^{-D})S(x) = S(x+1) - S(x-1) = -f(x-1)
\]


and hence 


D 
is, CO 
(:- xP Sm “amy! 
1 1 i os 
aa ( 5 (2D) 4 is Box 2 Obl hi 


are 3 2k Ba p=! ¢, 


Taking $f(x) = \log_{10} x$ and $\frac12 D^{-1}f = \frac12\int f(x)\,dx$, and noting that
$D^{2k-1}\ln x = \frac{(2k-2)!}{x^{2k-1}}$, we obtain exactly Stirling’s formula for the sum of logarithms (18.27).


18.4 Binet’s Integrals for In T(x) 


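Binet knew that Stirling’s two series (18.4) and (18.5) could be derived by Gauss’s
summation formula. In addition, he gave an interesting proof of (18.4) using integrals:¹⁵

\[
\begin{aligned}
B\left(m,\tfrac12\right) &= \int_0^1 x^{m-1}(1-x)^{-\frac12}\,dx = \int_0^1 x^{m-\frac12}(1-x)^{-\frac12}\bigl(1-(1-x)\bigr)^{-\frac12}\,dx\\
&= \int_0^1 x^{m-\frac12}(1-x)^{-\frac12}\left(1 + \frac12\,\frac{(1-x)}{1!} + \frac{1\cdot 3}{2^2}\,\frac{(1-x)^2}{2!} + \cdots\right)dx\\
&= B\left(m+\tfrac12,\tfrac12\right) + \frac12\,\frac{1}{1!}\,B\left(m+\tfrac12,\tfrac32\right) + \frac{1\cdot 3}{2^2}\,\frac{1}{2!}\,B\left(m+\tfrac12,\tfrac52\right) + \cdots.
\end{aligned}\tag{18.34}
\]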


When Euler’s formula, B(x, y) = Γ(x)Γ(y)/Γ(x + y), was applied in (18.34), Stirling’s
formula (18.4) followed. Binet’s proof of (18.5) ran along similar lines. He also gave 


two integral representations for 


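15 Binet (1839) pp. 239-241.
\[
\mu(x) = \ln\Gamma(x) - \left(x-\tfrac12\right)\ln x + x - \tfrac12\ln 2\pi. \tag{18.35}
\]

These useful representations were:

\[
\mu(x) = \int_0^\infty\left(\frac{1}{e^t-1} - \frac1t + \frac12\right)\frac{e^{-xt}}{t}\,dt \tag{18.36}
\]
\[
\phantom{\mu(x)} = 2\int_0^\infty\frac{\arctan(t/x)}{e^{2\pi t}-1}\,dt. \tag{18.37}
\]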


Binet demonstrated the equality of the two expressions for μ(x) by using the two
formulas:¹⁶

\[
\int_0^\infty e^{-sy}\sin(ty)\,dy = \frac{t}{t^2+s^2}, \tag{18.38}
\]
\[
4\int_0^\infty\frac{\sin(ty)}{e^{2\pi y}-1}\,dy = \frac{e^t+1}{e^t-1} - \frac2t. \tag{18.39}
\]


He attributed the first of these to Euler and the second to Poisson. He multiplied 
the second equation by $e^{-st}\,dt$, integrated over (0, ∞), and then used the first integral
to get
\[
\int_0^\infty\left(\frac{e^t+1}{e^t-1} - \frac2t\right)e^{-st}\,dt = 4\int_0^\infty\frac{t\,dt}{(t^2+s^2)(e^{2\pi t}-1)}.
\]

Binet then integrated both sides of this equation with respect to s over the interval
(x, ∞), and changed the order of integration to obtain

\[
\int_0^\infty\left(\frac12 - \frac1t + \frac{1}{e^t-1}\right)\frac{e^{-xt}}{t}\,dt = 2\int_0^\infty\frac{\arctan(t/x)}{e^{2\pi t}-1}\,dt.
\]
Thus, it was sufficient to prove one of the formulas for μ(x), and Binet proved the
first one, starting from the definition of μ(x):

\[
\mu(x+1) - \mu(x) = \left(x+\frac12\right)\ln\left(1-\frac{1}{x+1}\right) + 1
= -\sum_{n=2}^{\infty}\frac{n-1}{2n(n+1)(x+1)^n},
\]

or

\[
2\mu(x) = 2\mu(x+1) + \sum_{n=2}^{\infty}\frac{n-1}{n(n+1)(x+1)^n}. \tag{18.40}
\]


16 ibid. pp. 321-323. 
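By Stirling’s approximation, he had μ(x) → 0 as x → ∞, so that

\[
\mu(x) = \sum_{k=0}^{\infty}\bigl(\mu(x+k) - \mu(x+k+1)\bigr)
\]
and hence, by (18.40),
\[
2\mu(x) = \frac{1}{2\cdot 3}\sum_{k=1}^{\infty}\frac{1}{(x+k)^2} + \frac{2}{3\cdot 4}\sum_{k=1}^{\infty}\frac{1}{(x+k)^3} + \frac{3}{4\cdot 5}\sum_{k=1}^{\infty}\frac{1}{(x+k)^4} + \cdots.
\]
By Euler’s gamma integral
\[
\frac{\Gamma(n+1)}{(k+x)^{n+1}} = \int_0^\infty t^n e^{-t(k+x)}\,dt, \tag{18.41}
\]

and therefore

\[
\sum_{k=1}^{\infty}\frac{1}{(k+x)^{n+1}} = \frac{1}{\Gamma(n+1)}\int_0^\infty t^n\left(e^{-t(x+1)} + e^{-t(x+2)} + e^{-t(x+3)} + \cdots\right)dt
= \frac{1}{\Gamma(n+1)}\int_0^\infty\frac{t^n e^{-xt}}{e^t-1}\,dt.
\]
He then wrote

\[
2\mu(x) = \int_0^\infty\frac{e^{-xt}}{e^t-1}\left(\frac{t}{2\cdot 3} + \frac{2t^2}{2\cdot 3\cdot 4} + \frac{3t^3}{2\cdot 3\cdot 4\cdot 5} + \cdots\right)dt,
\]

and an easy calculation showed that the sum of the series inside the parentheses would
be
\[
\frac{2(e^t-1)}{t}\left(\frac{1}{e^t-1} - \frac1t + \frac12\right).
\]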


This completes Binet’s ingenious proof of his formulas. 
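
Binet's first integral for μ(x) lends itself to a quick numerical check against the definition (18.35). The following is a rough sketch only: it uses a plain midpoint rule on a truncated interval (our choice, not Binet's method), and the names `mu_by_integral` and `mu_by_definition` as well as the cutoff and step count are illustrative assumptions suited to x = 5.

```python
import math


def binet_integrand(t: float, x: float) -> float:
    """(1/(e^t - 1) - 1/t + 1/2) e^{-xt} / t, the integrand of Binet's first formula."""
    return (1 / math.expm1(t) - 1 / t + 0.5) * math.exp(-x * t) / t


def mu_by_integral(x: float, upper: float = 12.0, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral over (0, upper)."""
    h = upper / steps
    return h * sum(binet_integrand((i + 0.5) * h, x) for i in range(steps))


def mu_by_definition(x: float) -> float:
    """mu(x) = ln Gamma(x) - (x - 1/2) ln x + x - (1/2) ln(2 pi), as in (18.35)."""
    return math.lgamma(x) - (x - 0.5) * math.log(x) + x - 0.5 * math.log(2 * math.pi)


print(mu_by_integral(5.0), mu_by_definition(5.0))
```

The midpoint rule avoids evaluating the integrand at t = 0, where it has the finite limit 1/12 but is numerically delicate.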
De Moivre’s form of the asymptotic series for ln Γ(x) can be obtained from Binet’s
integrals. Start with Euler’s generating function for Bernoulli numbers,

\[
\frac{t}{e^t-1} = 1 - \frac12t + \sum_{n=1}^{\infty}\frac{B_{2n}}{(2n)!}t^{2n}.
\]

It follows easily that the integrand in the first integral for μ(x) is

\[
\frac1t\left(\frac{1}{e^t-1} - \frac1t + \frac12\right) = \sum_{n=0}^{\infty}\frac{B_{2n+2}}{(2n+2)!}t^{2n}. \tag{18.42}
\]
We substitute this series in the integrand and integrate term by term; an application
of (18.41) then yields de Moivre’s asymptotic series. Unfortunately, however, this last
operation is invalid because the series (18.42) is convergent only for |t| < 2π, whereas
we are integrating on (0, ∞).


18.5 Cauchy’s Proof of the Asymptotic Character of de Moivre’s Series 


In an 1843 paper in Comptes Rendus,¹⁷ Cauchy wrote that Stirling’s series for the
logarithm of a product whose factors increase in an arithmetic progression was
divergent and therefore it had no sum. However, he maintained, good approximations
could be obtained from this and other similar series for the functions represented by
the series expansions. Cauchy’s main result in this paper was that in the series he
denoted as Stirling’s series (though he was actually dealing with de Moivre’s series)
and other similar series, the first neglected term gave the upper bound of the error.
Thus, according to Cauchy’s result, if we were to take the terms $\left(m+\frac12\right)\ln m - m + \frac12\ln 2\pi$
from the series (18.7), representing the function ln m!, the error would
have to be less than the first of the neglected terms, $\frac{1}{12m}$. The basic formula Cauchy
employed was a finite form of equation (18.42):


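\[
\frac1t\left(\frac{1}{e^t-1} - \frac1t + \frac12\right) = \sum_{k=0}^{m-1}\frac{B_{2k+2}}{(2k+2)!}t^{2k} + \theta(t)\frac{B_{2m+2}}{(2m+2)!}t^{2m}, \quad 0 < \theta(t) < 1. \tag{18.43}
\]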


He noted that (18.42) converged only when |t| < 2π, while (18.43) was always true.
To prove (18.43), he first derived the partial fraction expansion of its left-hand side:

\[
\frac1t\left(\frac{1}{e^t-1} - \frac1t + \frac12\right) = \sum_{k=1}^{\infty}\frac{2}{t^2+(2k\pi)^2} \tag{18.44}
\]

using his own original complex analytic methods. However, the result was originally
due to Euler. See Euler’s equation (16.28), in which one may set t = 2iπx to produce
(18.44).

Cauchy next observed that half of each term in the sum on the right-hand side of 
(18.44) could be expanded as 


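\[
\frac{1}{(2k\pi)^2+t^2} = \frac{1}{(2k\pi)^2} - \frac{t^2}{(2k\pi)^4} + \cdots + (-1)^{m-1}\frac{t^{2m-2}}{(2k\pi)^{2m}} + (-1)^m\frac{t^{2m}}{(2k\pi)^{2m}\bigl((2k\pi)^2+t^2\bigr)}; \tag{18.45}
\]

to see this, one may employ the identity

\[
\frac{1}{1+x} = 1 - x + x^2 - \cdots + (-1)^{m-1}x^{m-1} + (-1)^m\frac{x^m}{1+x}.
\]

17 Cauchy (1843b).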
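Concerning the last term in (18.45), Cauchy then noted:

\[
\frac{t^{2m}}{(2k\pi)^{2m}\bigl((2k\pi)^2+t^2\bigr)} < \frac{t^{2m}}{(2k\pi)^{2m+2}}. \tag{18.46}
\]
Next, from (18.45) and (18.46),
\[
\sum_{k=1}^{\infty}\frac{1}{(2k\pi)^2+t^2} = \frac{1}{(2\pi)^2}\sum_{k=1}^{\infty}\frac{1}{k^2} - \frac{t^2}{(2\pi)^4}\sum_{k=1}^{\infty}\frac{1}{k^4} + \cdots + (-1)^{m-1}\frac{t^{2m-2}}{(2\pi)^{2m}}\sum_{k=1}^{\infty}\frac{1}{k^{2m}} + R_m(t), \tag{18.47}
\]
where
\[
0 < |R_m(t)| < \frac{t^{2m}}{(2\pi)^{2m+2}}\sum_{k=1}^{\infty}\frac{1}{k^{2m+2}}.
\]
Hence
\[
|R_m(t)| = \theta(t)\,\frac{t^{2m}}{(2\pi)^{2m+2}}\sum_{k=1}^{\infty}\frac{1}{k^{2m+2}},
\]
where
\[
0 < \theta(t) = \frac{\displaystyle\sum_{k=1}^{\infty}\frac{(2\pi)^2}{k^{2m}\bigl((2k\pi)^2+t^2\bigr)}}{\displaystyle\sum_{k=1}^{\infty}\frac{1}{k^{2m+2}}} < 1.
\]
An application of Euler’s formula
\[
\sum_{k=1}^{\infty}\frac{1}{k^{2n}} = \frac{(-1)^{n-1}2^{2n-1}B_{2n}}{(2n)!}\pi^{2n}
\]
enabled Cauchy to rewrite (18.47) as
\[
\sum_{k=1}^{\infty}\frac{2}{(2k\pi)^2+t^2} = \frac{B_2}{2!} + \frac{B_4}{4!}t^2 + \cdots + \frac{B_{2m}}{(2m)!}t^{2m-2} + \theta(t)\frac{B_{2m+2}}{(2m+2)!}t^{2m}. \tag{18.48}
\]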


The result (18.48) when substituted in (18.44) produced (18.43); multiplying 
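(18.43) by $e^{-xt}$ and integrating, Cauchy arrived at

\[
\begin{aligned}
\int_0^\infty\left(\frac{1}{e^t-1} - \frac1t + \frac12\right)e^{-xt}\,\frac{dt}{t}
={}& \sum_{k=0}^{m-1}\frac{B_{2k+2}}{(2k+2)!}\int_0^\infty t^{2k}e^{-xt}\,dt\\
&+ \frac{B_{2m+2}}{(2m+2)!}\int_0^\infty\theta(t)\,t^{2m}e^{-xt}\,dt.
\end{aligned}\tag{18.49}
\]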
0 


Since θ(t) is continuous and since $t^{2m}e^{-xt} > 0$ in (0, ∞), with θ₁ a constant
between 0 and 1,
\[
\int_0^\infty\theta(t)\,t^{2m}e^{-xt}\,dt = \theta_1\int_0^\infty t^{2m}e^{-xt}\,dt. \tag{18.50}
\]
Cauchy then noted that when k was a positive integer

\[
\int_0^\infty t^{2k}e^{-xt}\,dt = \frac{(2k)!}{x^{2k+1}}. \tag{18.51}
\]

Thus, by (18.50), he could rewrite (18.49) as

\[
\begin{aligned}
\int_0^\infty\left(\frac{1}{e^t-1} - \frac1t + \frac12\right)e^{-xt}\,\frac{dt}{t}
={}& \sum_{k=0}^{m-1}\frac{B_{2k+2}}{(2k+1)(2k+2)x^{2k+1}}\\
&+ \theta_1\frac{B_{2m+2}}{(2m+1)(2m+2)x^{2m+1}}, \quad 0 < \theta_1 < 1.
\end{aligned}
\]


By using Binet’s formula, given by (18.35) and (18.36), Cauchy obtained his final 
formula: 


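\[
\begin{aligned}
\ln\Gamma(x) ={}& \left(x-\frac12\right)\ln x - x + \frac12\ln 2\pi + \sum_{k=0}^{m-1}\frac{B_{2k+2}}{(2k+1)(2k+2)x^{2k+1}}\\
&+ \theta_1\frac{B_{2m+2}}{(2m+1)(2m+2)x^{2m+1}}, \quad 0 < \theta_1 < 1.
\end{aligned}\tag{18.52}
\]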


This completed Cauchy’s proof of his contention that if the mth term in the series 
were the first neglected term, then the absolute value of the error would be less than 
the absolute value of the mth term. 
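
Cauchy's bound can be observed numerically. The sketch below assumes the form of (18.52) and computes, for a given x, the ratio of the true remainder after m terms to the first neglected term; by Cauchy's result this ratio should lie strictly between 0 and 1. The function names are ours, and `math.lgamma` stands in for ln Γ(x).

```python
import math
from fractions import Fraction

# Bernoulli numbers B_2, B_4, ..., B_12.
BERNOULLI = [Fraction(1, 6), Fraction(-1, 30), Fraction(1, 42),
             Fraction(-1, 30), Fraction(5, 66), Fraction(-691, 2730)]


def term(k: int, x: float) -> float:
    """k-th term B_{2k+2} / ((2k+1)(2k+2) x^{2k+1}) of the series in (18.52)."""
    return float(BERNOULLI[k]) / ((2 * k + 1) * (2 * k + 2) * x ** (2 * k + 1))


def theta(m: int, x: float) -> float:
    """Ratio of the true remainder after m terms to the first neglected term."""
    main = (x - 0.5) * math.log(x) - x + 0.5 * math.log(2 * math.pi)
    partial = main + sum(term(k, x) for k in range(m))
    return (math.lgamma(x) - partial) / term(m, x)


print([round(theta(m, 3.0), 4) for m in range(6)])
```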

Cauchy also indicated a proof of the result first conjectured by de Moivre: For 
c> 1, 


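\[
\begin{aligned}
\frac{1}{n^c} + \frac{1}{(n+1)^c} + \frac{1}{(n+2)^c} + \cdots
={}& \frac{1}{(c-1)n^{c-1}} + \frac{1}{2n^c} + \frac{cB_2}{2\,n^{c+1}} + \frac{c(c+1)(c+2)B_4}{2\cdot3\cdot4\,n^{c+3}}\\
&+ \frac{c(c+1)(c+2)(c+3)(c+4)B_6}{2\cdot3\cdot4\cdot5\cdot6\,n^{c+5}}\\
&+ \frac{c(c+1)(c+2)(c+3)(c+4)(c+5)(c+6)B_8}{2\cdot3\cdot4\cdot5\cdot6\cdot7\cdot8\,n^{c+7}} + \cdots.
\end{aligned}\tag{18.53}
\]

Note that we have discussed the case c = 2 in (10.75), in connection with Stirling’s
observation on the computation of Bernoulli numbers. Note also that the right-hand
side of the series (18.53) is an asymptotic series; thus, Cauchy proved it in the
finite form:

\[
\begin{aligned}
\sum_{k=0}^{\infty}\frac{1}{(n+k)^c} ={}& \frac{1}{(c-1)n^{c-1}} + \frac{1}{2n^c} + \sum_{k=0}^{m-1}\frac{c(c+1)\cdots(c+2k)}{(2k+2)!}\,\frac{B_{2k+2}}{n^{c+2k+1}}\\
&+ \theta\,\frac{c(c+1)\cdots(c+2m)}{(2m+2)!}\,\frac{B_{2m+2}}{n^{c+2m+1}}, \quad 0 < \theta < 1.
\end{aligned}\tag{18.54}
\]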
In fact, de Moivre conjectured the infinite series form of (18.54), that exemplifies
the manner in which it was used in a finite form by him and his contemporaries.
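
The finite form (18.54) can be tested in the case c = 2, where the left-hand side is available in closed form as π²/6 minus the first n − 1 terms of the series of reciprocal squares. This is a sketch under that assumption; the helper names are ours, and only c = 2 is covered.

```python
import math
from fractions import Fraction

BERNOULLI = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42), 8: Fraction(-1, 30)}
C = 2  # the exponent c; the exact tail below is only valid for c = 2


def series_term(k: int, n: int) -> float:
    """c(c+1)...(c+2k) B_{2k+2} / ((2k+2)! n^{c+2k+1}), the k-th term in (18.54)."""
    rising = math.prod(C + j for j in range(2 * k + 1))
    return rising * float(BERNOULLI[2 * k + 2]) / (math.factorial(2 * k + 2) * n ** (C + 2 * k + 1))


def exact_tail(n: int) -> float:
    """sum_{k>=0} 1/(n+k)^2, computed as pi^2/6 minus the first n-1 terms."""
    return math.pi ** 2 / 6 - sum(1 / j ** 2 for j in range(1, n))


def theta(m: int, n: int) -> float:
    """Ratio of the true remainder after m Bernoulli terms to the first neglected term."""
    head = 1 / ((C - 1) * n ** (C - 1)) + 1 / (2 * n ** C)
    partial = head + sum(series_term(k, n) for k in range(m))
    return (exact_tail(n) - partial) / series_term(m, n)


print([round(theta(m, 10), 4) for m in range(4)])
```

For c = 2 the product c(c+1)···(c+2k) equals (2k+2)!, so the k-th term reduces to B_{2k+2}/n^{2k+3}.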

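To derive (18.54), Cauchy multiplied (18.43) by $t^c e^{-nt}$, integrated on (0, ∞), and
used (18.51) to obtain

\[
\begin{aligned}
\int_0^\infty\left(\frac{1}{e^t-1} - \frac1t + \frac12\right)t^{c-1}e^{-nt}\,dt
&= \sum_{k=0}^{m-1}\frac{B_{2k+2}}{(2k+2)!}\int_0^\infty t^{c+2k}e^{-nt}\,dt + \frac{B_{2m+2}}{(2m+2)!}\int_0^\infty\theta(t)\,t^{c+2m}e^{-nt}\,dt\\
&= \sum_{k=0}^{m-1}\frac{B_{2k+2}}{(2k+2)!}\,\frac{\Gamma(c+2k+1)}{n^{c+2k+1}} + \theta\,\frac{B_{2m+2}}{(2m+2)!}\,\frac{\Gamma(c+2m+1)}{n^{c+2m+1}}, \quad 0 < \theta < 1.
\end{aligned}\tag{18.55}
\]

Next, the first integral in equation (18.55) could be evaluated as

\[
\begin{aligned}
&\int_0^\infty\left(\frac12t^{c-1}e^{-nt} - t^{c-2}e^{-nt} + t^{c-1}e^{-(n+1)t} + t^{c-1}e^{-(n+2)t} + \cdots\right)dt\\
&\qquad= \frac{\Gamma(c)}{2n^c} - \frac{\Gamma(c-1)}{n^{c-1}} + \frac{\Gamma(c)}{(n+1)^c} + \frac{\Gamma(c)}{(n+2)^c} + \cdots,
\end{aligned}
\]

and when this was used in (18.55), after rearrangement of terms, the result was (18.54).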

As discussed in Chapter 20 in connection with the Euler—Maclaurin formula, 
(18.54) and (18.55) are particular cases of the results of Poisson and Jacobi on the 
E-M formula with a remainder term. Poisson and Jacobi derived these formulas using 
different methods in 1826 and 1834 respectively. Cauchy was probably aware of their 
results, since he published his paper in 1843. Cauchy’s work is of interest because
it shows how asymptotic series for log Γ(x) and for $\sum_{k=0}^{\infty}\frac{1}{(n+k)^c}$, c > 1, could be
obtained from explicit forms of these functions.


18.6 Exercises 


(1) Prove that if $S_m = \sqrt{2\pi}\left(\frac{m+\frac12}{e}\right)^{m+\frac12}$ and $D_m = \sqrt{2\pi m}\left(\frac{m}{e}\right)^m$, then
This shows that (18.9), implied by Stirling’s series, is a better approximation
for m! than $D_m$, resulting from de Moivre’s series but called Stirling’s approximation.
Note that $S_m$ gives values larger than m!, while $D_m$ underestimates
m!. See Stirling and Tweddle (1984).


(2) Prove that
\[
\int_0^1\ln\Gamma(x+u)\,du = x\ln x - x + \frac12\ln 2\pi \tag{18.56}
\]
by the following methods:

(a) Observe that the integral is equal to

\[
\lim_{n\to\infty}\frac1n\left(\ln\Gamma(x) + \ln\Gamma\left(x+\frac1n\right) + \cdots + \ln\Gamma\left(x+\frac{n-1}{n}\right)\right).
\]

Apply Gauss’s multiplication formula and Stirling’s approximation.

(b) Apply Euler’s reflection formula $\Gamma(x)\Gamma(1-x) = \frac{\pi}{\sin\pi x}$ to compute the
limit

\[
\int_0^1\ln\Gamma(u)\,du = \lim_{n\to\infty}\frac1n\left(\ln\Gamma\left(\frac1n\right) + \ln\Gamma\left(\frac2n\right) + \cdots + \ln\Gamma\left(\frac{n-1}{n}\right)\right)
= \frac12\ln 2\pi. \tag{18.57}
\]

Take the derivative of (18.56) with respect to x to show that
\[
\int_0^1\ln\Gamma(x+u)\,du = x\ln x - x + \int_0^1\ln\Gamma(u)\,du.
\]

(c) Denote the integral in (18.57) by C and show that
\[
2C = -\int_0^1 f(u)\,du, \quad\text{where } f(x) = \ln\left(\frac{\sin\pi x}{\pi}\right).
\]

Show that $C = \frac12\ln 2\pi$ by proving (i) $\int_0^1 f(u)\,du = \int_0^1 f\left(\frac u2\right)du$ and (ii)
$f(u) = f\left(\frac u2\right) + f\left(\frac{u+1}{2}\right) + \ln 2\pi$.

The proofs in (a) and (b) were published by Stieltjes in 1878. See
Stieltjes (1993) vol. 1, pp. 114-118. The proof in (c) was attributed
to Lerch by Hermite in his 1891 lectures at the École Normale. See
Hermite (1891).


(3) Integrate Gauss’s formula for $\frac{\Gamma'(1+y)}{\Gamma(1+y)}$ to obtain Plana’s formula

\[
\ln\Gamma(u) = \int_0^\infty\left(\frac{e^{(1-u)x}-1}{e^x-1} + (u-1)e^{-x}\right)\frac{dx}{x}.
\]

Then, by another integration, show that

\[
J = \int_a^{a+1}\ln\Gamma(u)\,du = \int_0^\infty\left(\frac{e^{-ax}}{x} - \frac{1}{e^x-1} + \left(a-\frac12\right)e^{-x}\right)\frac{dx}{x}.
\]

Deduce that

\[
\ln\Gamma(a) = J - \frac12\ln a + \int_0^\infty\left(\frac{1}{e^x-1} - \frac1x + \frac12\right)e^{-ax}\,\frac{dx}{x}.
\]

This is Binet’s formula (18.36) after the value of J is substituted from
Exercise 2. This proof of Binet’s formula is from Hermite (1891).


(4) Prove the formulas used by Binet:

\[
\int_0^\infty e^{-sy}\sin(ty)\,dy = \frac{t}{t^2+s^2},
\]

\[
4\int_0^\infty\frac{\sin(ty)}{e^{2\pi y}-1}\,dy = \frac{e^t+1}{e^t-1} - \frac2t.
\]

(5) Let n = 2m and $Y_n = \left(2^{-2m}\binom{2m}{m}\right)^2$. Prove the recurrence relation

\[
(n+1)(n+3)(Y_n - Y_{n+2}) - 2(n+1)Y_n - Y_{n+2} = 0, \quad n = 0, 2, 4, 6, \ldots.
\]

Assume

\[
Y_n = \frac{a_1}{n+1} + \frac{a_2}{(n+1)(n+3)} + \frac{a_3}{(n+1)(n+3)(n+5)} + \cdots.
\]

Then employ the recurrence relation to prove that $(2k-2)a_k = (2k-3)^2a_{k-1}$,
k = 2, 3, 4, …. Finally, use Wallis’s formula to show that $a_1 = 2/\pi$.
This is Stirling’s formal proof of (18.5) and is very similar to his proof of
(18.4) given in the text.


(6) Obtain Binet’s proof of (18.5) by observing that

\[
B\left(m+\tfrac12,\tfrac12\right) = \int_0^1 x^m(1-x)^{-\frac12}\bigl(1-(1-x)\bigr)^{-\frac12}\,dx,
\]

and following his argument for (18.4), given in the text.
(7) Note that 


=| k 
1 2m - 1+= 


k=1 m 


Now apply de Moivre’s method from the Miscellanea Analytica, given in 
the text, to obtain (18.10). 
(8) Prove the formula (18.44) used by Cauchy in his derivation of the remainder
in the series for μ(n). Cauchy started with the infinite product for $\sinh\left(\frac t2\right)$,
due to Euler, and took the logarithmic derivative. In fact, Euler was aware of
this result and this proof.


(9) In the Methodus, Stirling gave two more formulas for $b_m = \binom{2m}{m}$:


ant 1 ? Lea" 
(2m+1)]1 | etc. ], 
bn 2 222m —3) 2-42Qm —3)(Qm —5) 
bn \? 1 {? i? 23? 
= 1 etc. }. 

22m 7m 2(2m —2) 2-4(2m —2)(2m — 4) 
In his analysis of Stirling’s work, Binet pointed out that these formulas were 
incorrect and should be replaced by 


ee ee yy ’, (-3), 


2 


Din 
re | x ( 
(3) ee a 


where (a)k = a(a + 1)---(a + k — 1). Prove Binet’s formulas. See 
Binet (1839) pp. 319-320. For an analysis of Stirling’s results, see Stirling 
and Tweddle (2003). 

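(10) From de Moivre’s version (18.11), use Legendre’s duplication formula
$\sqrt{\pi}\,\Gamma(2x) = 2^{2x-1}\Gamma(x)\Gamma\left(x+\frac12\right)$ to obtain Stirling’s series

\[
\ln\Gamma(x+1) = \left(x+\frac12\right)\ln\left(x+\frac12\right) - \left(x+\frac12\right)
+ \frac12\ln 2\pi - \frac{B_2}{4\left(x+\frac12\right)} - \frac{7B_4}{96\left(x+\frac12\right)^3} - \text{etc.} \tag{18.58}
\]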


See Gauss (1863-1927) vol. II, p. 152. 


18.7 Notes on the Literature 


Hald (1990), a book on the history of probability and statistics, devotes two chapters 
to the work of de Moivre and he also discusses the work of de Moivre and Stirling; see 
especially pages 480-489. Schneider (1968) gives a thorough analysis of the totality 
of de Moivre’s mathematics. 


19 


Fourier Series 


19.1 Preliminary Remarks 


The problem of representing functions by trigonometric series has played as signif- 
icant a role in the development of mathematics and mathematical physics as that of 
representing functions as power series. Trigonometric series take the form 


\[
\tfrac12a_0 + a_1\cos x + b_1\sin x + a_2\cos 2x + b_2\sin 2x + \cdots, \tag{19.1}
\]

and these series naturally made their appearance in eighteenth-century works on
astronomy, a subject dealing with periodic phenomena. Now series (19.1) is called
a Fourier series if, for some function f(x) defined on (0, 2π), the coefficients $a_n$ and
$b_n$ can be expressed as

\[
a_n = \frac1\pi\int_0^{2\pi}f(t)\cos nt\,dt; \quad b_n = \frac1\pi\int_0^{2\pi}f(t)\sin nt\,dt. \tag{19.2}
\]

Moreover, if (19.1) converges to some integrable function f(x) and can be
integrated term by term, then the coefficients $a_n$ and $b_n$ will take the form (19.2).
Thus, Fourier series have very wide applicability.

In connection with investigations on the vibratory motion of a stretched string, 
trigonometric series of the type (19.1) were used, although the coefficients were 
not explicitly written as integrals. These researches led to controversy among the 
principal investigators, d’Alembert, Euler, Bernoulli, and Lagrange, as to whether 
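an ‘arbitrary’ function could be represented by such series. This dispute began with
d’Alembert’s 1746 discovery, published in 1747, of the wave equation describing the
motion of the vibrating string:¹

\[
\sigma\frac{\partial^2y}{\partial t^2} = T\frac{\partial^2y}{\partial x^2} \quad\text{or}\quad \frac{\partial^2y}{\partial t^2} = c^2\frac{\partial^2y}{\partial x^2}, \quad c^2 = \frac{T}{\sigma}, \tag{19.3}
\]

1 d’Alembert (1747). This paper was followed in the same year by another in the same journal, with pages
consecutive to the earlier one.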
where σ and T were constants and y was the displacement of the string. The derivation
was based on the work of Taylor dating from 1715.² D’Alembert showed that the
general solution of equation (19.3) would be of the form

\[
y = \Phi(ct+x) + \Psi(ct-x),
\]

but the initial and boundary conditions implied a relation between Φ and Ψ. For
example, at x = 0 and x = l, the string would be fixed and hence y = 0 for all t
at these points. This implied that for all u

\[
0 = \Phi(u) + \Psi(u), \quad\text{or}\quad \Psi(u) = -\Phi(u) \tag{19.4}
\]
and
\[
0 = \Phi(u+l) + \Psi(u-l). \tag{19.5}
\]

By (19.4), the general solution took the form y = Φ(ct + x) − Φ(ct − x), and
by (19.5), Φ was periodic: Φ(u + 2l) = Φ(u). Interestingly, d’Alembert’s paper also
gave the first instance of the use of separation of variables to solve partial differential
equations. He set

\[
\Phi(ct+x) - \Phi(ct-x) = f(t)\,g(x) \tag{19.6}
\]
and by differentiation obtained the relation

\[
\frac{1}{c^2}\frac{f''}{f} = \frac{g''}{g} = A.
\]

Since f was independent of x and g of t, A was a constant and the expressions for f
and g could be obtained from their differential equations. Note that from the boundary
conditions, it can be shown that f and g are sine and cosine functions. D’Alembert,
however, saw these solutions as special cases of the general solution.

Euler responded to d’Alembert’s work by publishing his ideas on the matter within
a few months.³ Essentially, he and d’Alembert disagreed on the meaning of the
function Φ(u). D’Alembert thought that Φ had to be an analytic expression, whereas
Euler was of the view that Φ was an arbitrary graph defined only by the periodicity
condition

\[
\Phi(u+2l) = \Phi(u).
\]

On this view, Φ could be defined by different expressions in different intervals; in our
terms, Φ would be continuous but its derivative could be piecewise continuous. The
functions allowed by Euler as solutions of Φ would now be called weak solutions of
the equation, while d’Alembert required the solutions to be twice differentiable. And
while Euler allowed all possible initial conditions on Φ, d’Alembert ruled out certain
initial conditions.


2 Taylor (1715).
3 Eu. II-10 pp. 50-62. E 119.
Euler also criticized Taylor’s contention that an arbitrary initial vibration would
eventually settle into a sinusoidal one. He argued from the equation of motion that
higher frequencies would also be involved and that the solution could have the form

\[
y = \sum A_n\sin\frac{n\pi x}{l}\cos\frac{n\pi ct}{l},
\]

with the initial shape given by $\sum A_n\sin\frac{n\pi x}{l}$. According to Truesdell, Euler was therefore “the first to
publish formulae for the simple modes of a string and to observe that they can be
combined simultaneously with arbitrary amplitudes.”⁴ However, Euler did not regard
these trigonometric series as the most general solutions of the problem.

At this point, Daniel Bernoulli entered the discussion by presenting in 1747 and
in 1748 two memoirs to the Petersburg Academy, eventually published by the Berlin
Academy in 1753,⁵ in which he explained on physical grounds that the trigonometric
solutions found by Euler were in fact the most general possible. Bernoulli’s ideas had
been developing for more than a decade; moreover, he explained that they were based on
the work of Taylor, who had observed that the basic shapes of the vibrating string of
length a were given by the functions

\[
\sin\frac{\pi x}{a},\quad \sin\frac{2\pi x}{a},\quad \sin\frac{3\pi x}{a},\ \ldots.
\]

Bernoulli argued that the general form of the curve for the string would be obtained
by linear superposition:

\[
y = \alpha\sin\frac{\pi x}{a} + \beta\sin\frac{2\pi x}{a} + \gamma\sin\frac{3\pi x}{a} + \delta\sin\frac{4\pi x}{a} + \text{etc.}
\]


Unlike Euler, Bernoulli thought that all possible curves assumed by the vibrating 
string could be obtained in this way. It is interesting to note that in 1728 Bernoulli 
solved a linear difference equation by taking linear combinations of certain special 
solutions. However, he was unable to extend this idea to the solutions of ordinary 
linear differential equations; Euler did this around 1740, as we mention in Chapter 14. 
Finally in 1748, Bernoulli once again proposed this idea to solve a linear partial 
differential equation, but this time he gave a physical argument. He apparently saw no 
need here for the differential equation and thought that the mathematics only obscured 
the main ideas. This led to further discussion and controversy, mainly involving
d’Alembert and Bernoulli.

It seems that these discussions led Euler to further ponder on the problem of
expanding functions in terms of trigonometric series. In a paper written around 1752,
Euler started with the divergent series⁷


4 Truesdell (1960) p. 250.

5 D. Bernoulli (1753a) and (1753b).

6 Bernoulli (1982-1996) vol. 2, pp. 49-64.
7 Eu. I-14 p. 584. E 246.
\[
\cos x - \cos 2x + \cos 3x - \cdots = \frac12
\]

and after integration obtained the formulas

\[
\frac{x}{2} = \sin x - \frac12\sin 2x + \frac13\sin 3x - \frac14\sin 4x + \cdots,
\]

\[
\frac{\pi^2}{12} - \frac{x^2}{4} = \cos x - \frac14\cos 2x + \frac19\cos 3x - \frac1{16}\cos 4x + \cdots.
\]

He gave no range of validity for the formulas, but two decades later D. Bernoulli
observed⁸ that these results were true only in the interval −π < x < π. Euler
also continued the integration process to obtain similar formulas with polynomials
of degree 3, 4 and 5; obviously, the process could be continued. The polynomials
occurring in this situation were the (Jakob) Bernoulli polynomials. Neither Bernoulli
nor Euler seems to have noticed this. It appears that the Swiss mathematician Joseph
Raabe (1801-1859) was the first to show this explicitly, around 1850.⁹ Recall that
the Fourier expansion of the Bernoulli polynomials also follows when Poisson’s 
remainder in the Euler—Maclaurin formula, dating from the 1820s, is set equal to the 
remainder derived by Jacobi in the 1830s. 
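
D. Bernoulli's observation about the range of validity is easy to see numerically. The sketch below assumes the cosine formula in the form π²/12 − x²/4 = cos x − ¼ cos 2x + ⅑ cos 3x − ···; it compares a long partial sum with the polynomial inside and outside (−π, π). The function names and the number of terms are our choices.

```python
import math


def euler_cosine_partial_sum(x: float, terms: int = 20000) -> float:
    """Partial sum of cos x - cos 2x / 4 + cos 3x / 9 - ..."""
    return sum((-1) ** (n + 1) * math.cos(n * x) / n ** 2 for n in range(1, terms + 1))


def euler_polynomial(x: float) -> float:
    """The quadratic pi^2/12 - x^2/4 that the series represents on (-pi, pi)."""
    return math.pi ** 2 / 12 - x ** 2 / 4


for x in (-3.0, 0.5, 2.0, 4.0):
    print(x, euler_cosine_partial_sum(x), euler_polynomial(x))
```

At x = 4, which lies outside the interval, the series instead returns the value of the polynomial at 4 − 2π, because the series is periodic while the polynomial is not.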

In 1759, Lagrange wrote a paper on the vibrating string problem¹⁰ in which he
attempted to obtain Euler’s general solution with arbitrary functions by first finding
the explicit solution for the loaded string and then taking the limit. The equations of
motion in the latter case were
\[
\frac{d^2y_k}{dt^2} = c^2(y_{k+1} - 2y_k + y_{k-1}), \quad k = 1, 2, \ldots, n, \tag{19.7}
\]


and were first obtained by Johann Bernoulli in 1727. Euler studied them in a slightly 
different context in 1748 and obtained solutions by setting 


A es 
% = Ap cos ——= 
: aes 
and finding 
‘ ra 
p =sin ———-, r=l.,...,n 
2(n +1) 


and the value of $A_k$ from the corresponding second-order difference equation.
Lagrange solved (19.7) by writing the equations as the first-order system

\[
\frac{dy_k}{dt} = v_k, \quad \frac{dv_k}{dt} = c^2(y_{k+1} - 2y_k + y_{k-1}). \tag{19.8}
\]


8 D. Bernoulli (1982-1996) vol. 2, pp. 119-121.
9 Raabe (1850).
10 Lagrange (1867-1892) vol. 1, pp. 72-90.
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In the course of this work, Lagrange came close to deriving the Fourier coefficients in 
the expansion of a function as a series of sines. Instead, he took a different course, 
since his aim was to derive Euler’s general solution rather than a trigonometric 
series. 

Surprisingly, as early as 1757, while studying the perturbations created by the sun, 
Alexis-Claude Clairaut gave the Fourier coefficients in the case of a cosine series 
expansion.¹¹ He viewed the question of finding the coefficients $A_0, A_1, A_2, \ldots$ in 

\[ f(x) = A_0 + 2 \sum_{m=1}^{\infty} A_m \cos mx \tag{19.9} \]

as an interpolation problem, given that values of f were known at 
$x = \frac{2\pi}{k}, \frac{4\pi}{k}, \frac{6\pi}{k}, \ldots$. He found 

\[ A_n = \frac{1}{k} \sum_{m=1}^{k} f\left(\frac{2m\pi}{k}\right) \cos \frac{2mn\pi}{k} \]

and then let $k \to \infty$, to get 

\[ A_n = \frac{1}{2\pi} \int_0^{2\pi} f(x) \cos nx \, dx. \tag{19.10} \]

Twenty years later, Euler derived (19.10) directly by multiplying (19.9) by cos nx and 
using the orthogonality of the cosine function.¹² 
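
As a numerical illustration of Clairaut’s interpolation idea, the following Python sketch evaluates the discrete average (1/k) Σ f(2mπ/k) cos(2mnπ/k), m = 1, ..., k, for a test function written in the form (19.9). The names `clairaut_coefficient` and `sample_f`, and the coefficients chosen for the test function, are assumptions made for this example only; they are not taken from Clairaut’s paper.

```python
import math

def clairaut_coefficient(f, n, k):
    """Discrete estimate of A_n: (1/k) * sum_{m=1}^{k} f(2*pi*m/k) * cos(2*pi*m*n/k)."""
    return sum(
        f(2 * math.pi * m / k) * math.cos(2 * math.pi * m * n / k)
        for m in range(1, k + 1)
    ) / k

def sample_f(x):
    # Hypothetical test function in the form (19.9) with
    # A_0 = 1, A_1 = 0.5, A_3 = -0.25 and all other A_m = 0.
    return 1.0 + 2.0 * (0.5 * math.cos(x) - 0.25 * math.cos(3 * x))

if __name__ == "__main__":
    for n in range(4):
        print(n, round(clairaut_coefficient(sample_f, n, 16), 12))
```

For a finite cosine sum such as `sample_f`, the discrete average already recovers the coefficients once k exceeds twice the highest frequency present; for a general f one would expect agreement only in the limit of large k, which is the passage to the integral (19.10).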

Joseph Fourier (1768-1830) lost his parents as a child; he was then sent by the 
bishop of Auxerre to a military college run by the Benedictines. Fourier’s earliest 
researches were in algebraic equations, and he went to Paris in 1789 to present his 
results to the Academy. He soon became involved in revolutionary activities and 
gained a reputation as an orator. In 1795, Fourier began studying with Gaspard Monge; 
he soon published his first paper and announced plans to present a series of papers on 
algebraic equations. But Monge selected him to join Napoleon’s scientific expedition 
to Egypt. When Fourier returned to France in 1801, Napoleon appointed him an 
administrator in Isère. Fourier ably executed his duties, but found time to successfully 
carry out his difficult researches in heat conduction, presented to the Academy in 
1807. His work was reviewed by Lagrange, Laplace, Lacroix, and Monge; Lagrange 
opposed its publication. Perhaps to make up for this, the Academy then set a prize 
problem in the conduction of heat, won by Fourier in 1812. 

It is not clear whether Euler thought that f(x) in (19.10) was an arbitrary 
function. But in his 1807 work on heat conduction, Fourier took this view explicitly.¹³ 


11 Clairaut (1759) p. 545. 
12 Eu. 1-16 pp. 311-332. E 703. 
13 See Grattan-Guinness (2003). 
He translated a physics problem into the mathematical one of finding a function v 
such that 

\[ \frac{d^2 v}{dx^2} + \frac{d^2 v}{dy^2} = 0 \]

and v(0, y) = v(r, y) = 0, v(x, 0) = f(x). By a separation of variables, Fourier found 
v to be given by the series 

\[ v = a_1 e^{-\pi y/r} \sin \frac{\pi x}{r} + a_2 e^{-2\pi y/r} \sin \frac{2\pi x}{r} + a_3 e^{-3\pi y/r} \sin \frac{3\pi x}{r} + \cdots, \]

with the coefficients $a_1, a_2, a_3, \ldots$ to be obtained from 

\[ f(x) = a_1 \sin \frac{\pi x}{r} + a_2 \sin \frac{2\pi x}{r} + a_3 \sin \frac{3\pi x}{r} + \cdots. \tag{19.11} \]


Fourier discussed three methods for deriving these coefficients. In one approach, he 
converted equation (19.11) into a system of infinitely many equations in infinitely 
many unknowns $a_1, a_2, a_3, \ldots$. He also considered problems that reduced to cosine 
series and to series with sines as well as cosines. In his 1913 monograph on such 
systems, Les systèmes d’équations linéaires, the Hungarian mathematician Frigyes 
Riesz (1880-1956) wrote that Fourier was the first to deal with linear equations in 
infinitely many unknowns.¹⁴ Fourier gave two other methods for determining the 
coefficients. One method depended on the discrete orthogonality of the sine function, 
and the other on its continuous orthogonality. Dirichlet later gave a brief exposition 
of Fourier’s discrete orthogonality method, explaining why the integral representation 
for $a_n$ was plausible. 

Fourier regarded the use of an infinite system of equations in infinitely many 
unknowns as important enough to first discuss a particular case. He expanded a 
constant function as an infinite series of cosines: 

\[ 1 = a \cos y + b \cos 3y + c \cos 5y + d \cos 7y + \text{etc.} \tag{19.12} \]

To see briefly how he determined the coefficients a, b, c, d, ...,¹⁵ we write the 
equation in the form 

\[ 1 = \sum_{m=1}^{\infty} a_m \cos(2m-1)y. \]

He took derivatives of all orders of this equation and set y = 0 to obtain 


14 Riesz (1913) pp. 2-8. 
15 Fourier (1955) pp. 137-143. 
\[ 1 = \sum_{m=1}^{\infty} a_m, \qquad 0 = \sum_{m=1}^{\infty} (2m-1)^2 a_m, \qquad 0 = \sum_{m=1}^{\infty} (2m-1)^4 a_m, \quad \text{etc.} \]

He considered the first n equations with n = 1, 2, 3, ..., and replaced these n 
equations with a new set of n equations, taking $a_m = 0$ for m > n. This new 
system could be regarded as n equations in the n unknowns $a_1^{(n)}, a_2^{(n)}, \ldots, a_n^{(n)}$. 
By using the well-known formula now known as Cramer’s rule, Fourier calculated 
the Vandermonde determinants appearing in this situation to find that 

\[ a_1^{(n)} = \frac{3 \cdot 3}{2 \cdot 4} \cdot \frac{5 \cdot 5}{4 \cdot 6} \cdots \frac{(2n-1)(2n-1)}{(2n-2)(2n)}, \]

with a similar formula for $a_m^{(n)}$. He assumed that $a_m^{(n)} \to a_m$ as $n \to \infty$. This, 
by Wallis’s formula, gave him $a_1 = \frac{4}{\pi}$ and in general $a_m = (-1)^{m-1} \frac{4}{(2m-1)\pi}$. By 
substituting these $a_m$ back in (19.12), he obtained 

\[ \frac{\pi}{4} = \cos y - \frac{1}{3} \cos 3y + \frac{1}{5} \cos 5y - \frac{1}{7} \cos 7y + \frac{1}{9} \cos 9y - \text{etc.} \]
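
The truncation procedure described above can be checked with exact rational arithmetic. The Python sketch below solves the n × n system Σ (2m − 1)^(2j) a_m = δ_{j0} (j = 0, ..., n − 1) by Gauss-Jordan elimination and compares the first unknown with the finite product (3·3)/(2·4) · (5·5)/(4·6) ··· whose limit, by Wallis’s formula, is 4/π. The function names `truncated_solution` and `wallis_type_product` are hypothetical, and the elimination routine is a generic one, not a reconstruction of Fourier’s own determinant computation.

```python
from fractions import Fraction

def truncated_solution(n):
    """Solve sum_m (2m-1)^(2j) * a_m = [j == 0], j = 0..n-1, exactly."""
    rows = [
        [Fraction((2 * m - 1) ** (2 * j)) for m in range(1, n + 1)]
        + [Fraction(1 if j == 0 else 0)]
        for j in range(n)
    ]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def wallis_type_product(n):
    """Product of (2k-1)^2 / ((2k-2)(2k)) for k = 2..n (empty product for n = 1)."""
    prod = Fraction(1)
    for k in range(2, n + 1):
        prod *= Fraction((2 * k - 1) ** 2, (2 * k - 2) * (2 * k))
    return prod

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, truncated_solution(n)[0], wallis_type_product(n))
```

Exact fractions are used so that the agreement between the solved system and the product is an identity rather than a floating-point coincidence.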


Fourier did not discuss the validity of his method. Interestingly, according to Riesz, 
the question of the justification of this process was first considered by Henri 
Poincaré (1854-1912) in 1885. Poincaré’s attention was drawn to this problem by 
a paper of Paul Appell in which Appell applied Fourier’s method to obtain the 
coefficients of a cosine expansion of an elliptic function. Poincaré gave a simple 
theorem justifying Appell’s calculations.¹⁶ A year later he wrote another paper on 
the subject, “Sur les déterminants d’ordre infini.” 

The term infinite determinant was introduced by the American astronomer and 
mathematician G. W. Hill (1838-1914) in an 1877 paper on lunar theory. In this paper, 
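he solved the equation 

\[ \frac{d^2 w}{dt^2} + \left( \sum_{n=-\infty}^{\infty} \theta_n e^{int} \right) w = 0 \]

by making the substitution 

\[ w = \sum_{n=-\infty}^{\infty} b_n e^{i(n+c)t} \]

and determining $b_n$ from the infinite system of equations 

\[ \sum_{k=-\infty}^{\infty} \theta_{n-k} b_k - (n+c)^2 b_n = 0, \quad n = -\infty, \ldots, +\infty. \]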

16 See Riesz (1913) pp. 20-24 for references. 
Hill employed a procedure similar to that of Fourier and once again Poincaré developed 
the necessary theorems to justify Hill’s result.¹⁷ Poincaré’s work was generalized 
a decade later by the Swedish mathematician Niels Helge von Koch (1870-1924) and 
this was the starting point for his countryman Ivar Fredholm’s (1866-1927) theory of 
integral equations. Fredholm in turn provided the basis for the pioneering work in the 
development of functional analysis by David Hilbert and then Riesz, with significant 
contributions from others such as Erhard Schmidt (1876-1959).¹⁸ They created the 
ideas and techniques by which linear equations in infinitely many variables could be 
treated by general methods. The valuable 1913 book by Riesz, one of the earliest 
monographs on functional analysis, contains an interesting history of the topic. Surely 
Fourier could not have foreseen that his idea would see such beautiful development. 
On the other hand, he must have considered it worthy of attention, since he included 
the long derivation of the formula for the Fourier coefficients by this method when he 
was well aware of the much shorter method using term-by-term integration. 


19.2 Euler: Trigonometric Expansion of a Function 


In a very interesting paper presented to the Petersburg Academy in 1750, 
“De Serierum Determinatione seu Nova Methodus Inveniendi Terminos Generales 
Serierum,”¹⁹ Euler used symbolic calculus to expand a function as a 
trigonometric series. He also applied the discoveries he had made a decade earlier 
on solving differential equations with constant coefficients. Given a function X, his 
problem was to determine y(x) such that 

\[ y(x) - y(x-1) = X(x). \tag{19.13} \]

He viewed this as a differential equation of infinite order: 

\[ \frac{dy}{dx} - \frac{1}{1 \cdot 2} \frac{d^2 y}{dx^2} + \frac{1}{1 \cdot 2 \cdot 3} \frac{d^3 y}{dx^3} - \cdots = X. \]

He noted that if $d^n y/dx^n$ was replaced by $z^n$, then the left-hand side could be 
expressed as 

\[ z - \frac{z^2}{1 \cdot 2} + \frac{z^3}{1 \cdot 2 \cdot 3} - \cdots = 1 - e^{-z}. \]

He observed that the factors of $1 - e^{-z}$ were z and $z^2 + 4k^2\pi^2$ for k = 1, 2, 3, .... 
Hence, dy/dx and $d^2y/dx^2 + 4k^2\pi^2 y$ were factors of the differential equation. 
The solution of the differential equation corresponding to dy/dx was given by 
$y = \int X \, dx$, while the solution corresponding to $d^2y/dx^2 + 4k^2\pi^2 y$ was given by 


17 ibid. 
18 See Dieudonné (1981) pp. 75-120. 
19 Eu. 1-14 pp. 463-515. E 189. 

\[ y = 2(\cos 2k\pi \cos 2k\pi x - \sin 2k\pi \sin 2k\pi x) \int X \cos 2k\pi x \, dx + 2(\cos 2k\pi \sin 2k\pi x + \sin 2k\pi \cos 2k\pi x) \int X \sin 2k\pi x \, dx, \]

and since sin 2kπ = 0, cos 2kπ = 1, Euler could write the complete solution 

\[ \begin{aligned} y = \int X \, dx &+ 2\cos 2\pi x \int X \cos 2\pi x \, dx + 2\cos 4\pi x \int X \cos 4\pi x \, dx + \cdots \\ &+ 2\sin 2\pi x \int X \sin 2\pi x \, dx + 2\sin 4\pi x \int X \sin 4\pi x \, dx + \cdots. \end{aligned} \tag{19.14} \]


Although (19.14) appears to be a Fourier series, the integrals are indefinite, so it is 
not. If the integrals were on the interval [0, 1], then the right-hand side would be the 
Fourier series of X, not of y. 
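
The remark about definite integrals can be tried numerically. The Python sketch below replaces each indefinite integral in Euler’s formula by the integral over [0, 1] and sums finitely many harmonics; for a smooth test function the result should be close to X itself. The names `integrate_unit_interval` and `definite_version_of_19_14`, the Simpson rule, the number of harmonics, and the test function X(t) = t(1 − t) are all choices made for this illustration.

```python
import math

def integrate_unit_interval(g, steps=1000):
    """Composite Simpson rule on [0, 1]; steps must be even."""
    width = 1.0 / steps
    total = g(0.0) + g(1.0)
    for s in range(1, steps):
        total += (4 if s % 2 else 2) * g(s * width)
    return total * width / 3

def definite_version_of_19_14(X, x, harmonics=50):
    """Right-hand side of Euler's formula with every integral taken over [0, 1]."""
    value = integrate_unit_interval(X)
    for k in range(1, harmonics + 1):
        w = 2 * math.pi * k
        c = integrate_unit_interval(lambda t: X(t) * math.cos(w * t))
        s = integrate_unit_interval(lambda t: X(t) * math.sin(w * t))
        value += 2 * math.cos(w * x) * c + 2 * math.sin(w * x) * s
    return value

if __name__ == "__main__":
    X = lambda t: t * (1.0 - t)
    for x in (0.1, 0.35, 0.8):
        print(x, X(x), definite_version_of_19_14(X, x))
```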


19.3 Lagrange on the Longitudinal Motion of the Loaded Elastic String 


In his study of the vibrating string, Lagrange considered the situation in which the 
masses were assumed to be at a discrete set of points so that he could express the rate 
of change with respect to x in terms of finite differences.²⁰ He wrote the equations in 
the form 

\[ \frac{dy_k}{dt} = u_k, \qquad \frac{du_k}{dt} = C^2 (y_{k+1} - 2y_k + y_{k-1}), \tag{19.15} \]

where k = 1, 2, ..., m − 1 and $y_0 = y_m = 0$. His idea was to determine constants $M_k$, 
$N_k$ and R such that 

\[ \sum_{k=1}^{m-1} (M_k \, du_k + N_k \, dy_k) = \sum_{k=1}^{m-1} \left( N_k u_k + C^2 M_k (y_{k+1} - 2y_k + y_{k-1}) \right) dt \]

would be reduced to dz = Rz dt, where $z = \sum_{k=1}^{m-1} (M_k u_k + N_k y_k)$. This required that 

\[ R \sum_{k=1}^{m-1} (M_k u_k + N_k y_k) = \sum_{k=1}^{m-1} \left( N_k u_k + C^2 M_k (y_{k+1} - 2y_k + y_{k-1}) \right), \tag{19.16} \]

or, on comparing the coefficients of $u_k$ and $y_k$ for k = 1, 2, ..., m − 1, 

\[ R M_k = N_k, \qquad R N_k = C^2 (M_{k+1} - 2M_k + M_{k-1}). \]


20 Lagrange (1867-1892) vol. 1, pp. 72-90. 
This meant that $M_k$ satisfied the equation 

\[ M_{k+1} - \left( \frac{R^2}{C^2} + 2 \right) M_k + M_{k-1} = 0. \tag{19.17} \]

Lagrange set $M_k = A a^k + B b^k$, so that a and b were roots of 

\[ x^2 - \left( \frac{R^2}{C^2} + 2 \right) x + 1 = 0. \]

Thus, 

\[ ab = 1 \quad \text{and} \quad a + b = \frac{R^2}{C^2} + 2. \]

Note that because of the restriction on $y_0$ and $y_m$, Lagrange could assume without loss 
of generality that $M_0 = M_m = 0$. He also set $M_1 = 1$. From this it followed that 
A + B = 0 and Aa + Bb = 1. With these initial values, he could find the constants 
A, B in $M_k$. Thus, he could write 

\[ M_k = \frac{a^k - b^k}{a - b} \quad \text{and} \quad \frac{a^m - b^m}{a - b} = 0, \]

yielding him m − 1 pairs of values $a_n = e^{n\pi i/m}$ and $b_n = e^{-n\pi i/m}$ for n = 1, 2, ..., m − 1. 
Corresponding to these were m − 1 values of M and R: 

\[ M_{kn} = \frac{e^{kn\pi i/m} - e^{-kn\pi i/m}}{e^{n\pi i/m} - e^{-n\pi i/m}} = \frac{\sin \frac{kn\pi}{m}}{\sin \frac{n\pi}{m}}, \tag{19.18} \]

\[ R_n = \pm 2iC \sin \frac{n\pi}{2m}, \quad n = 1, 2, \ldots, m-1. \tag{19.19} \]

For these values he had the corresponding equations 

\[ dz_n = R_n z_n \, dt \quad \text{where} \quad z_n = \sum_{k=1}^{m-1} (M_{kn} u_k + R_n M_{kn} y_k). \tag{19.20} \]

The solution of the differential equation for $z_n$ yielded 

\[ z_n = F_n e^{R_n t} \]

with $F_n$ a constant. Next he set 

\[ Z_n = \sum_{k=1}^{m-1} M_{kn} y_k \tag{19.21} \]
so that $\frac{dy_k}{dt} = u_k$ implied that 

\[ \frac{dZ_n}{dt} + R_n Z_n = F_n e^{R_n t}. \tag{19.22} \]

Lagrange expressed the constant $F_n$ as $2R_n K_n$ so that he could solve this differential 
equation in the form 

\[ Z_n = K_n e^{R_n t} + L_n e^{-R_n t}, \tag{19.23} \]

where $L_n$ was a constant of integration. Here recall that (19.22) can be solved by 
multiplying it by the integrating factor $e^{R_n t}$. By substituting the value of $R_n$ from 
(19.19), he could write $Z_n$ in terms of the sine and cosine functions 

\[ Z_n = P_n \cos\left( 2Ct \sin \frac{n\pi}{2m} \right) + Q_n \frac{\sin\left( 2Ct \sin \frac{n\pi}{2m} \right)}{2C \sin \frac{n\pi}{2m}}. \tag{19.24} \]

Then, $Z_n$ being known, the problem was to determine $y_k$ from (19.21); after 
substituting the value of $M_{kn}$ from (19.18), (19.21) took the form 

\[ Z_n \sin \frac{n\pi}{m} = \sum_{k=1}^{m-1} y_k \sin \frac{kn\pi}{m}, \quad n = 1, 2, \ldots, m-1. \tag{19.25} \]

The next step was to obtain the m − 1 unknowns $y_1, y_2, \ldots, y_{m-1}$ from these m − 1 
equations. Several years before Lagrange, in 1748, Euler had encountered this system 
of equations in his study of the loaded elastic cord.²¹ He was able to write the solution 
in general after studying the special cases where m < 6. He saw that the result 
followed from the discrete orthogonality relation for the sine function: 

\[ \sum_{k=1}^{m-1} \sin \frac{kn\pi}{m} \sin \frac{kp\pi}{m} = \frac{1}{2} m \, \delta_{np}, \tag{19.26} \]

for which he did not provide a complete proof. But Lagrange gave an ingenious proof 
of (19.26) and obtained 

\[ y_j = \frac{2}{m} \sum_{n=1}^{m-1} Z_n \sin \frac{n\pi}{m} \sin \frac{nj\pi}{m} \tag{19.27} \]

by multiplying (19.25) by $\sin \frac{nj\pi}{m}$, summing over n and applying (19.26). In a 
later paper, Lagrange observed that the analysis involved in moving from (19.25) to 
(19.27) also solved an interpolation problem related to trigonometric polynomials. 
Specifically, given the m − 1 values 

\[ f\left(\frac{\pi}{m}\right), \; f\left(\frac{2\pi}{m}\right), \; \ldots, \; f\left(\frac{(m-1)\pi}{m}\right) \]


21 Eu. 2-10 pp. 98-131. E 136. 
of a function f(x), the problem was to find a trigonometric polynomial 

\[ a_1 \sin x + a_2 \sin 2x + \cdots + a_{m-1} \sin(m-1)x \tag{19.28} \]

passing through the m − 1 points $\left( \frac{k\pi}{m}, f\left(\frac{k\pi}{m}\right) \right)$, k = 1, 2, ..., m − 1. By an application of 
(19.26), it was clear that 

\[ m a_n = 2 \sin \frac{n\pi}{m} f\left(\frac{\pi}{m}\right) + 2 \sin \frac{2n\pi}{m} f\left(\frac{2\pi}{m}\right) + \cdots + 2 \sin \frac{(m-1)n\pi}{m} f\left(\frac{(m-1)\pi}{m}\right), \tag{19.29} \]

and one obtained the coefficients of the trigonometric polynomial interpolating f(x). 
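
The discrete orthogonality relation and the resulting interpolation formula lend themselves to a direct numerical check. The Python sketch below sums sin(knπ/m) sin(kpπ/m) over k = 1, ..., m − 1, and then builds the coefficients a_n = (2/m) Σ sin(knπ/m) f(kπ/m) and confirms that the sine polynomial passes through the sample points. The function names and the test function x(π − x) are assumptions chosen for this example.

```python
import math

def sine_gram(m, n, p):
    """Sum over k = 1..m-1 of sin(k*n*pi/m) * sin(k*p*pi/m)."""
    return sum(
        math.sin(k * n * math.pi / m) * math.sin(k * p * math.pi / m)
        for k in range(1, m)
    )

def interpolation_coefficients(f, m):
    """Coefficients a_1..a_{m-1}: a_n = (2/m) * sum_k sin(k*n*pi/m) * f(k*pi/m)."""
    return [
        2.0 / m * sum(math.sin(k * n * math.pi / m) * f(k * math.pi / m)
                      for k in range(1, m))
        for n in range(1, m)
    ]

def sine_polynomial(coeffs, x):
    """Evaluate a_1 sin x + a_2 sin 2x + ... for coeffs = [a_1, a_2, ...]."""
    return sum(a * math.sin((n + 1) * x) for n, a in enumerate(coeffs))

if __name__ == "__main__":
    m = 8
    f = lambda x: x * (math.pi - x)  # hypothetical sample function
    coeffs = interpolation_coefficients(f, m)
    for k in range(1, m):
        x = k * math.pi / m
        print(k, f(x), sine_polynomial(coeffs, x))
```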
We reproduce Dirichlet’s 1837 proof²² of the orthogonality relation (19.26), since 
it is more illuminating than Lagrange’s complicated though clever proof. The same 
method was clearly described in Fourier’s 1822 book on heat. The idea was to 
apply the addition formula for the sine function; note that this addition formula 
is also used for the integral analog of (19.26). First note that by the addition 
formula 

\[ 2 \sin \frac{kn\pi}{m} \sin \frac{kp\pi}{m} = \cos \frac{k(n-p)\pi}{m} - \cos \frac{k(n+p)\pi}{m}. \]

So when n ≠ p, twice the sum in (19.26) is given by 

\[ \sum_{k=1}^{m-1} \left( \cos \frac{k(n-p)\pi}{m} - \cos \frac{k(n+p)\pi}{m} \right) = \frac{\sin \left(m - \frac{1}{2}\right)(n-p)\frac{\pi}{m}}{2 \sin (n-p)\frac{\pi}{2m}} - \frac{\sin \left(m - \frac{1}{2}\right)(n+p)\frac{\pi}{m}}{2 \sin (n+p)\frac{\pi}{2m}} = 0, \tag{19.30} \]

since each expression is either −1/2 or 1/2, according as n − p (and hence n + p) is even or odd. To sum the 
series in (19.30) Dirichlet employed the formula 

\[ 1 + 2\cos 2\theta + 2\cos 4\theta + \cdots + 2\cos 2s\theta = \frac{\sin (2s+1)\theta}{\sin \theta}, \]

also provable by the addition formula for the sine function. Dirichlet pointed out that 
(19.28) and (19.29) strongly suggested that a function f(x) could be expanded as a 
Fourier series. Observe that $a_n$ can be expressed as 


22 Dirichlet (1969) vol. 1, pp. 139-142. 
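\[ a_n = \frac{2}{\pi} \cdot \frac{\pi}{m} \left( \sin \frac{n\pi}{m} f\left(\frac{\pi}{m}\right) + \sin \frac{2n\pi}{m} f\left(\frac{2\pi}{m}\right) + \cdots + \sin \frac{(m-1)n\pi}{m} f\left(\frac{(m-1)\pi}{m}\right) \right), \]

and when $m \to \infty$, the right-hand side tends to 

\[ \frac{2}{\pi} \int_0^{\pi} \sin nx \, f(x) \, dx. \]

Thus, $f(x) = a_1 \sin x + a_2 \sin 2x + \cdots + a_n \sin nx + \cdots$, with 

\[ a_n = \frac{2}{\pi} \int_0^{\pi} \sin nx \, f(x) \, dx. \]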


Observe here that Lagrange missed this opportunity to discover Fourier series, partly 
because he was focused on obtaining the results of d’Alembert and Euler and 
partly because he did not think that functions could be represented by such series. 


19.4 Euler on Fourier Series 


In 1777, Euler submitted a paper to the Petersburg Academy containing a derivation 
of the Fourier coefficients of a cosine series.²³ This was the first derivation of the 
coefficients using the orthogonality of the sequence of functions cos nx, n = 1, 2, .... 
Euler’s paper was published in 1798, but its contents did not become generally known 
until much later; a half century afterward, Riemann thought that Fourier was the first 
to give such a derivation. Euler expanded a function Φ as a trigonometric series, 
$\Phi = A + B \cos \phi + C \cos 2\phi + \cdots$ and gave the coefficients as 

\[ A = \frac{1}{\pi} \int_0^{\pi} \Phi \, d\phi, \quad B = \frac{2}{\pi} \int_0^{\pi} \Phi \, d\phi \cos \phi, \quad C = \frac{2}{\pi} \int_0^{\pi} \Phi \, d\phi \cos 2\phi, \; \ldots. \tag{19.31} \]

His argument was that since $\int d\phi \cos i\phi = \frac{1}{i} \sin i\phi = 0$, on integration from 
φ = 0 to φ = π, he would get 

\[ \int_0^{\pi} \Phi \, d\phi = A\pi. \]

Next, by the addition formula for the cosine function, when i ≠ λ, 

\[ d\phi \cos i\phi \cos \lambda\phi = \tfrac{1}{2} \, d\phi \left( \cos (i-\lambda)\phi + \cos (i+\lambda)\phi \right), \]

\[ \int_0^{\pi} d\phi \cos i\phi \cos \lambda\phi = \left[ \frac{\sin (i-\lambda)\phi}{2(i-\lambda)} + \frac{\sin (i+\lambda)\phi}{2(i+\lambda)} \right]_0^{\pi} = 0. \]


23 Eu. 1-16 pp. 311-332. E 703. 
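And when i = λ, 

\[ \int_0^{\pi} d\phi (\cos i\phi)^2 = \left[ \tfrac{1}{2}\phi + \tfrac{1}{4i} \sin 2i\phi \right]_0^{\pi} = \tfrac{1}{2}\pi \]

for 

\[ (\cos i\phi)^2 = \tfrac{1}{2} + \tfrac{1}{2} \cos (2i\phi). \]

Hence the coefficients A, B, C, D, ... were as given in (19.31). 
In this paper, Euler included a proof of the well-known recurrence relation 

\[ \int_0^{\pi} d\phi (\cos \phi)^{\lambda} = \frac{\lambda - 1}{\lambda} \int_0^{\pi} d\phi (\cos \phi)^{\lambda - 2}. \]

We mention that Euler wrote $\cos \phi^{\lambda}$ for $(\cos \phi)^{\lambda}$. Observe that, though he was 
well aware of the integration by parts formula in the standard form $\int P \, dQ = PQ - \int Q \, dP$, 
he usually worked it out in a slightly different way. For example, to prove the 
recurrence formula, Euler started with 

\[ \int d\phi \cos^{\lambda} \phi = f \sin \phi \cos^{\lambda - 1} \phi + g \int d\phi \cos^{\lambda - 2} \phi, \]

where f and g had to be determined. He differentiated to get 

\[ \cos^{\lambda} \phi = f \cos^{\lambda} \phi - f(\lambda - 1) \sin^2 \phi \cos^{\lambda - 2} \phi + g \cos^{\lambda - 2} \phi. \]

Since $\sin^2 \phi = 1 - \cos^2 \phi$, 

\[ \cos^{\lambda} \phi = \lambda f \cos^{\lambda} \phi - f(\lambda - 1) \cos^{\lambda - 2} \phi + g \cos^{\lambda - 2} \phi, \]

for which Euler required that $f = \frac{1}{\lambda}$ and g = f(λ − 1) or $g = \frac{\lambda - 1}{\lambda}$. Hence 

\[ \int d\phi \cos^{\lambda} \phi = \frac{1}{\lambda} \sin \phi \cos^{\lambda - 1} \phi + \frac{\lambda - 1}{\lambda} \int d\phi \cos^{\lambda - 2} \phi. \]

Finally, he took the integral from 0 to π. In this paper, as in some others, Euler used 
the notation ∂φ instead of dφ. 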
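
The recurrence relating the integral of the λth power of cos φ over [0, π] to that of the (λ − 2)th power, with factor (λ − 1)/λ, is easy to confirm numerically for small integer exponents. The following Python sketch uses a composite Simpson rule; the function names `simpson` and `cos_power_integral` and the step count are assumptions for this illustration and have nothing to do with Euler’s own method.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def cos_power_integral(lam):
    """Numerical value of the integral of cos(phi)**lam over [0, pi], integer lam >= 0."""
    return simpson(lambda phi: math.cos(phi) ** lam, 0.0, math.pi)

if __name__ == "__main__":
    for lam in range(2, 8):
        left = cos_power_integral(lam)
        right = (lam - 1) / lam * cos_power_integral(lam - 2)
        print(lam, left, right)
```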


19.5 Fourier and Linear Equations in Infinitely Many Unknowns 


Fourier rediscovered Euler’s derivation of the Fourier coefficients. Nevertheless, 
Fourier sought alternative derivations to convince the mathematical community of the 
correctness of the Fourier expansion. He also presented several derivations of Fourier 
expansions of specific functions. In spite of these efforts, his theory encountered 
a certain amount of opposition, mainly from the older generation. In section 6 of 
the third chapter of his famous book, Théorie analytique de la chaleur, Fourier 
considered the problem of determining the coefficients in the sine expansion of 
an odd function.²⁴ He reduced this problem to that of solving an infinite system 
of equations in infinitely many unknowns. It is interesting to see how beautifully 
Fourier carried out the computations; he started with a sine series expansion of an 
odd function 

\[ \phi(x) = a \sin x + b \sin 2x + c \sin 3x + d \sin 4x + \cdots. \tag{19.32} \]

He let $A = \phi'(0)$, $B = -\phi'''(0)$, $C = \phi^{(5)}(0)$, $D = -\phi^{(7)}(0)$, ..., so that by 
repeatedly differentiating (19.32), he had 

\[ \begin{aligned} A &= a + 2b + 3c + 4d + 5e + \cdots, \\ B &= a + 2^3 b + 3^3 c + 4^3 d + 5^3 e + \cdots, \\ C &= a + 2^5 b + 3^5 c + 4^5 d + 5^5 e + \cdots, \\ D &= a + 2^7 b + 3^7 c + 4^7 d + 5^7 e + \cdots, \\ E &= a + 2^9 b + 3^9 c + 4^9 d + 5^9 e + \cdots, \end{aligned} \tag{19.33} \]

and so on. He broke up this system into the subsystems 

\[ a_1 = A_1, \qquad \begin{aligned} a_2 + 2b_2 &= A_2, \\ a_2 + 2^3 b_2 &= B_2, \end{aligned} \qquad \begin{aligned} a_3 + 2b_3 + 3c_3 &= A_3, \\ a_3 + 2^3 b_3 + 3^3 c_3 &= B_3, \\ a_3 + 2^5 b_3 + 3^5 c_3 &= C_3, \end{aligned} \tag{19.34} \]

\[ \begin{aligned} a_4 + 2b_4 + 3c_4 + 4d_4 &= A_4, \\ a_4 + 2^3 b_4 + 3^3 c_4 + 4^3 d_4 &= B_4, \\ a_4 + 2^5 b_4 + 3^5 c_4 + 4^5 d_4 &= C_4, \\ a_4 + 2^7 b_4 + 3^7 c_4 + 4^7 d_4 &= D_4, \end{aligned} \tag{19.35} \]

\[ \begin{aligned} a_5 + 2b_5 + 3c_5 + 4d_5 + 5e_5 &= A_5, \\ a_5 + 2^3 b_5 + 3^3 c_5 + 4^3 d_5 + 5^3 e_5 &= B_5, \\ a_5 + 2^5 b_5 + 3^5 c_5 + 4^5 d_5 + 5^5 e_5 &= C_5, \\ a_5 + 2^7 b_5 + 3^7 c_5 + 4^7 d_5 + 5^7 e_5 &= D_5, \\ a_5 + 2^9 b_5 + 3^9 c_5 + 4^9 d_5 + 5^9 e_5 &= E_5, \end{aligned} \tag{19.36} \]

and so on. Fourier’s strategy was to solve the first equation for $a_1$, the second for $2b_2$, 
the third for $3c_3$, and so on. He wrote that the equations could be solved by inspection, 
meaning that they could be obtained by Cramer’s rule, since the determinants in the 
equations were Vandermonde determinants. He also established the recursive relations 

24 See Fourier (1955) pp. 168-185. 
connecting $a_{j-1}$ with $a_j$, $b_{j-1}$ with $b_j$, $c_{j-1}$ with $c_j$, etc., and similarly with the 
right-hand sides of the equations, $A_j$, $B_j$, $C_j$, etc. He assumed that as $j \to \infty$, 

\[ a_j \to a, \quad b_j \to b, \quad c_j \to c, \quad \text{etc.}, \]

and 

\[ A_j \to A, \quad B_j \to B, \quad C_j \to C, \quad \text{etc.} \]

To find the recursive relations, Fourier eliminated $e_5$ from the last five equations 
to get 

\[ \begin{aligned} a_5(5^2 - 1^2) + 2b_5(5^2 - 2^2) + 3c_5(5^2 - 3^2) + 4d_5(5^2 - 4^2) &= 5^2 A_5 - B_5, \\ a_5(5^2 - 1^2) + 2^3 b_5(5^2 - 2^2) + 3^3 c_5(5^2 - 3^2) + 4^3 d_5(5^2 - 4^2) &= 5^2 B_5 - C_5, \\ a_5(5^2 - 1^2) + 2^5 b_5(5^2 - 2^2) + 3^5 c_5(5^2 - 3^2) + 4^5 d_5(5^2 - 4^2) &= 5^2 C_5 - D_5, \\ a_5(5^2 - 1^2) + 2^7 b_5(5^2 - 2^2) + 3^7 c_5(5^2 - 3^2) + 4^7 d_5(5^2 - 4^2) &= 5^2 D_5 - E_5. \end{aligned} \tag{19.37} \]

Fourier then argued that for this system to coincide with the system of four equations 
in (19.35), he must have 

\[ a_4 = (5^2 - 1^2) a_5, \quad b_4 = (5^2 - 2^2) b_5, \quad c_4 = (5^2 - 3^2) c_5, \quad d_4 = (5^2 - 4^2) d_5, \tag{19.38} \]

\[ A_4 = 5^2 A_5 - B_5, \quad B_4 = 5^2 B_5 - C_5, \quad C_4 = 5^2 C_5 - D_5, \quad D_4 = 5^2 D_5 - E_5. \tag{19.39} \]

We remark that he wrote out all his equations in this manner, noting that this reasoning 
would apply in general to the m × m system of equations. We now write his formulas 
in shorter form. From the relations (19.38) and (19.39), it is evident that 

\[ \begin{aligned} a_{j-1} &= a_j (j^2 - 1^2), \quad j = 2, 3, 4, \ldots, \\ b_{j-1} &= b_j (j^2 - 2^2), \quad j = 3, 4, 5, \ldots, \\ c_{j-1} &= c_j (j^2 - 3^2), \quad j = 4, 5, 6, \ldots, \\ d_{j-1} &= d_j (j^2 - 4^2), \quad j = 5, 6, 7, \ldots, \end{aligned} \tag{19.40} \]

and also 

\[ \begin{aligned} A_{j-1} &= j^2 A_j - B_j, \quad j = 2, 3, 4, \ldots, \\ B_{j-1} &= j^2 B_j - C_j, \quad j = 3, 4, 5, \ldots, \\ C_{j-1} &= j^2 C_j - D_j, \quad j = 4, 5, 6, \ldots, \\ D_{j-1} &= j^2 D_j - E_j, \quad j = 5, 6, 7, \ldots. \end{aligned} \tag{19.41} \]
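As we noted before, Fourier assumed that $a_j \to a$, $b_j \to b$, ..., $A_j \to A$, ... as 
$j \to \infty$. So from (19.40), he could conclude that 

\[ \begin{aligned} a &= \frac{a_1}{(2^2 - 1^2)(3^2 - 1^2)(4^2 - 1^2)(5^2 - 1^2) \cdots}, \\ b &= \frac{b_2}{(3^2 - 2^2)(4^2 - 2^2)(5^2 - 2^2) \cdots}, \\ c &= \frac{c_3}{(4^2 - 3^2)(5^2 - 3^2)(6^2 - 3^2) \cdots}, \end{aligned} \tag{19.42} \]

and so on. Similarly, a repeated application of (19.41) gave him 

\[ A_1 = A_2 2^2 - B_2, \qquad A_1 = A_3 2^2 \cdot 3^2 - B_3 (2^2 + 3^2) + C_3, \]

\[ A_1 = A_4 2^2 \cdot 3^2 \cdot 4^2 - B_4 (2^2 \cdot 3^2 + 2^2 \cdot 4^2 + 3^2 \cdot 4^2) + C_4 (2^2 + 3^2 + 4^2) - D_4, \quad \text{etc.} \]

To understand Fourier’s next step, one may divide the first value of $A_1$ by $2^2$, the 
second value of $A_1$ by $2^2 \cdot 3^2$, the third by $2^2 \cdot 3^2 \cdot 4^2$, and consider the form of the 
right-hand side. So by dividing the ultimate equation for $A_1$ by $2^2 \cdot 3^2 \cdot 4^2 \cdot 5^2 \cdots$, Fourier 
obtained by equations (19.34) and (19.42) 

\[ \begin{aligned} \frac{A_1}{2^2 \cdot 3^2 \cdot 4^2 \cdot 5^2 \cdots} &= \frac{a (2^2 - 1)(3^2 - 1)(4^2 - 1)(5^2 - 1) \cdots}{2^2 \cdot 3^2 \cdot 4^2 \cdot 5^2 \cdots} \\ &= A - B \left( \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots \right) + C \left( \frac{1}{2^2 \cdot 3^2} + \frac{1}{2^2 \cdot 4^2} + \frac{1}{3^2 \cdot 4^2} + \cdots \right) \\ &\quad - D \left( \frac{1}{2^2 \cdot 3^2 \cdot 4^2} + \frac{1}{2^2 \cdot 3^2 \cdot 5^2} + \cdots \right) + \cdots \\ &= A - B P_1 + C Q_1 - D R_1 + E S_1 - \cdots. \end{aligned} \tag{19.43} \]

We note that by $P_1, Q_1, R_1, \ldots$ Fourier meant the sums of products of $\frac{1}{2^2}, \frac{1}{3^2}, \frac{1}{4^2}, \ldots$ 
taken one, two, three, ... at a time. This gave him the value of a in terms of A, B, C, D 
etc. To find the values of b, c, d, ... in a similar manner, Fourier solved the second 
and third systems in (19.34) to find $2b_2$ and $3c_3$. Similarly, he solved (19.35) 
for $4d_4$, etc. to arrive at the solutions: 

\[ \begin{aligned} A - B P_2 + C Q_2 - D R_2 + \cdots &= 2b_2 \frac{(1^2 - 2^2)}{1^2 \cdot 3^2 \cdot 4^2 \cdot 5^2 \cdots}, \\ A - B P_3 + C Q_3 - D R_3 + \cdots &= 3c_3 \frac{(1^2 - 3^2)(2^2 - 3^2)}{1^2 \cdot 2^2 \cdot 4^2 \cdot 5^2 \cdot 6^2 \cdots}, \\ A - B P_4 + C Q_4 - D R_4 + \cdots &= 4d_4 \frac{(1^2 - 4^2)(2^2 - 4^2)(3^2 - 4^2)}{1^2 \cdot 2^2 \cdot 3^2 \cdot 5^2 \cdot 6^2 \cdots}, \end{aligned} \tag{19.44} \]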
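and so on. The starting points for deriving these equations were 

\[ \begin{aligned} 2b_2 (1^2 - 2^2) &= A_2 1^2 - B_2, \\ 3c_3 (1^2 - 3^2)(2^2 - 3^2) &= A_3 1^2 \cdot 2^2 - B_3 (1^2 + 2^2) + C_3, \\ 4d_4 (1^2 - 4^2)(2^2 - 4^2)(3^2 - 4^2) &= A_4 1^2 \cdot 2^2 \cdot 3^2 - B_4 (1^2 \cdot 2^2 + 1^2 \cdot 3^2 + 2^2 \cdot 3^2) + C_4 (1^2 + 2^2 + 3^2) - D_4, \end{aligned} \tag{19.45} \]

and so on. As before, these relations were continued by repeated use of (19.41). Once 
again Fourier applied (19.40) to express $b_2, c_3, d_4, \ldots$ in terms of b, c, d, .... Recall 
that $b_j \to b$, $c_j \to c$, $d_j \to d$, etc. He then had 

\[ \begin{aligned} A - B P_1 + C Q_1 - D R_1 + \cdots &= a \left( 1 - \frac{1^2}{2^2} \right) \left( 1 - \frac{1^2}{3^2} \right) \left( 1 - \frac{1^2}{4^2} \right) \cdots, \\ A - B P_2 + C Q_2 - D R_2 + \cdots &= 2b \, \frac{(1^2 - 2^2)(3^2 - 2^2)(4^2 - 2^2) \cdots}{1^2 \cdot 3^2 \cdot 4^2 \cdot 5^2 \cdots} = 2b \left( 1 - \frac{2^2}{1^2} \right) \left( 1 - \frac{2^2}{3^2} \right) \left( 1 - \frac{2^2}{4^2} \right) \cdots, \\ A - B P_3 + C Q_3 - D R_3 + \cdots &= 3c \left( 1 - \frac{3^2}{1^2} \right) \left( 1 - \frac{3^2}{2^2} \right) \left( 1 - \frac{3^2}{4^2} \right) \cdots, \\ A - B P_4 + C Q_4 - D R_4 + \cdots &= 4d \left( 1 - \frac{4^2}{1^2} \right) \left( 1 - \frac{4^2}{2^2} \right) \left( 1 - \frac{4^2}{3^2} \right) \left( 1 - \frac{4^2}{5^2} \right) \cdots, \end{aligned} \tag{19.46} \]

and so on. To compute the values of the products on the right-hand side of (19.46), 
and the values of $P_j, Q_j, R_j, S_j, \ldots$, observe that 

\[ \frac{\sin \pi x}{\pi x} = \left( 1 - \frac{x^2}{1^2} \right) \left( 1 - \frac{x^2}{2^2} \right) \left( 1 - \frac{x^2}{3^2} \right) \cdots. \tag{19.47} \]

Fourier did not write down the details of the evaluations of the products on the 
right-hand sides of (19.46), but they are fairly simple. Note that the first product has a 
factor $\left(1 - \frac{1^2}{1^2}\right)$ missing, the second $\left(1 - \frac{2^2}{2^2}\right)$, the third $\left(1 - \frac{3^2}{3^2}\right)$, etc. Now the value of 
the product with a missing $1 - \frac{j^2}{j^2}$ can be evaluated by (19.47) to be 

\[ \lim_{\epsilon \to 0} \frac{\sin \pi (j + \epsilon)}{\pi (j + \epsilon) \left( 1 - \frac{(j + \epsilon)^2}{j^2} \right)} = \lim_{\epsilon \to 0} \frac{j^2 (-1)^{j-1} \sin \pi \epsilon}{\pi (j + \epsilon) \, \epsilon \, (2j + \epsilon)} = \frac{(-1)^{j-1}}{2}. \tag{19.48} \]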
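
The limiting value (−1)^(j−1)/2 for the sine product with its jth factor removed can be checked by multiplying out a long but finite partial product. The Python sketch below does this; the function name `product_missing_factor` and the truncation length are assumptions for this illustration, and the partial product only approximates the infinite one (the neglected tail is of relative size about j²/terms).

```python
def product_missing_factor(j, terms=200_000):
    """Partial product of (1 - j^2/n^2) over n = 1..terms, omitting the factor n = j."""
    prod = 1.0
    for n in range(1, terms + 1):
        if n != j:
            prod *= 1.0 - (j * j) / (n * n)
    return prod

if __name__ == "__main__":
    for j in (1, 2, 3):
        print(j, product_missing_factor(j), (-1) ** (j - 1) / 2)
```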


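To find $P_j, Q_j, R_j, \ldots$, he expanded the product on the right-hand side of (19.47) as a series 

\[ 1 - P x^2 + Q x^4 - R x^6 + \cdots, \tag{19.49} \]

so that P, Q, R, ... were sums of products of $\frac{1}{1^2}, \frac{1}{2^2}, \frac{1}{3^2}, \ldots$ taken one, two, three, ... 
(respectively) at a time. He could then equate this series with the known power series 
for $\frac{\sin \pi x}{\pi x}$, 

\[ 1 - \frac{\pi^2 x^2}{3!} + \frac{\pi^4 x^4}{5!} - \frac{\pi^6 x^6}{7!} + \cdots, \]

to get 

\[ P = \frac{\pi^2}{3!}, \quad Q = \frac{\pi^4}{5!}, \quad R = \frac{\pi^6}{7!}, \; \ldots. \]

Moreover, it is easy to see that 

\[ P = P_j + \frac{1}{j^2}, \quad Q = Q_j + \frac{1}{j^2} P_j, \quad R = R_j + \frac{1}{j^2} Q_j, \; \ldots. \]

Hence 

\[ P_j = P - \frac{1}{j^2} = \frac{\pi^2}{3!} - \frac{1}{j^2}, \quad Q_j = Q - \frac{1}{j^2} P_j = \frac{\pi^4}{5!} - \frac{1}{j^2} \frac{\pi^2}{3!} + \frac{1}{j^4}, \; \ldots, \tag{19.51} \]

and 

\[ \begin{aligned} \frac{1}{2} a &= A - B \left( \frac{\pi^2}{3!} - \frac{1}{1^2} \right) + C \left( \frac{\pi^4}{5!} - \frac{1}{1^2} \frac{\pi^2}{3!} + \frac{1}{1^4} \right) - \cdots, \\ -\frac{1}{2} 2b &= A - B \left( \frac{\pi^2}{3!} - \frac{1}{2^2} \right) + C \left( \frac{\pi^4}{5!} - \frac{1}{2^2} \frac{\pi^2}{3!} + \frac{1}{2^4} \right) - \cdots, \end{aligned} \tag{19.52} \]

etc. Now recall that $A = \phi'(0)$, $-B = \phi'''(0)$, $C = \phi^{(5)}(0)$, ...; one may use the 
expressions for a, b, c, ... in equation (19.52) to get 

\[ \begin{aligned} \frac{1}{2} \phi(x) &= \sin x \left\{ \phi'(0) + \phi'''(0) \left( \frac{\pi^2}{3!} - \frac{1}{1^2} \right) + \phi^{(5)}(0) \left( \frac{\pi^4}{5!} - \frac{1}{1^2} \frac{\pi^2}{3!} + \frac{1}{1^4} \right) + \cdots \right\} \\ &\quad - \frac{1}{2} \sin 2x \left\{ \phi'(0) + \phi'''(0) \left( \frac{\pi^2}{3!} - \frac{1}{2^2} \right) + \phi^{(5)}(0) \left( \frac{\pi^4}{5!} - \frac{1}{2^2} \frac{\pi^2}{3!} + \frac{1}{2^4} \right) + \cdots \right\} \\ &\quad + \frac{1}{3} \sin 3x \left\{ \phi'(0) + \phi'''(0) \left( \frac{\pi^2}{3!} - \frac{1}{3^2} \right) + \phi^{(5)}(0) \left( \frac{\pi^4}{5!} - \frac{1}{3^2} \frac{\pi^2}{3!} + \frac{1}{3^4} \right) + \cdots \right\} - \cdots. \end{aligned} \]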
Fourier noted that the expression in the first set of chain brackets was the Maclaurin 
series for 

\[ \frac{1}{\pi} \left\{ \phi(\pi) - \frac{1}{1^2} \phi''(\pi) + \frac{1}{1^4} \phi^{(4)}(\pi) - \frac{1}{1^6} \phi^{(6)}(\pi) + \cdots \right\}. \]

Similarly, the expressions in the second and third brackets were 

\[ \frac{1}{\pi} \left\{ \phi(\pi) - \frac{1}{2^2} \phi''(\pi) + \frac{1}{2^4} \phi^{(4)}(\pi) - \frac{1}{2^6} \phi^{(6)}(\pi) + \cdots \right\}, \]

\[ \frac{1}{\pi} \left\{ \phi(\pi) - \frac{1}{3^2} \phi''(\pi) + \frac{1}{3^4} \phi^{(4)}(\pi) - \frac{1}{3^6} \phi^{(6)}(\pi) + \cdots \right\}. \]

To sum the expressions in chain brackets, Fourier observed that 

\[ s(x) = \phi(x) - \frac{1}{m^2} \phi''(x) + \frac{1}{m^4} \phi^{(4)}(x) - \cdots \]

satisfied the differential equation 

\[ \frac{1}{m^2} s''(x) + s(x) = \phi(x). \]

He noted that the general solution of this differential equation was 

\[ s(x) = C_1 \cos mx + C_2 \sin mx + m \sin mx \int_0^x \phi(t) \cos mt \, dt - m \cos mx \int_0^x \phi(t) \sin mt \, dt. \]

Because φ(x) was an odd function, its even-order derivatives were also odd functions, 
making s(x) an odd function. This meant that $C_1 = 0$. Hence, 

\[ s(\pi) = (-1)^{m+1} m \int_0^{\pi} \phi(t) \sin mt \, dt, \]

and this in turn implied 

\[ a_m = \frac{2}{\pi} \int_0^{\pi} \phi(t) \sin mt \, dt. \]

Thus, Fourier found the “Fourier” coefficients. 


19.6 Dirichlet’s Proof of Fourier’s Theorem 


Fourier’s work clearly demonstrated the tremendous significance of trigonometric 
series in the study of heat conduction and more generally in solving partial differ- 
ential equations with boundary conditions. As we have seen, Fourier offered many 
arguments for the validity of his methods. But when the work of Gauss, Cauchy, 
and Abel on convergence of series became known in the 1820s, Fourier’s methods 
were perceived to be nonrigorous. Lejeune Dirichlet (1805-1859) studied in France 
with Fourier and Poisson, who introduced him to problems in mathematical physics. 
At the same time, Dirichlet became familiar with the latest ideas on the rigorous 
treatment of infinite power series. Dirichlet’s first great achievement was to treat 
infinite trigonometric series with equal rigor, thereby vindicating Fourier, who had 
befriended him. 

In 1829, Dirichlet published his famous paper on Fourier series, “Sur la convergence 
des séries trigonométriques qui servent à représenter une fonction arbitraire 
entre des limites données”²⁵ in the newly founded Crelle’s Journal. Eight years later 
he published the same paper in the Berlin Academy journal with further computational 
details and a more careful analysis of convergence. We follow the 1829 paper, whose 
title indicates that Dirichlet’s aim was to obtain conditions on an arbitrary function 
so that the corresponding Fourier series would converge to the function. Dirichlet 
started his paper by observing that Fourier began a new era in analysis by applying 
trigonometric series in his researches on heat. However, he noted that only one paper, 
published by Cauchy in 1823, had discussed the validity of this method. Dirichlet 
noted, moreover, that the results of Cauchy’s paper were inconclusive because they 
were based on the false premise that if the series with nth term $v_n = \frac{\sin nx}{n}$ converged, 
the series $\sum u_n$ also converged when $\frac{u_n}{v_n}$ had 1 as a limit. Dirichlet produced examples 
of two series with nth terms 

\[ \frac{(-1)^n}{\sqrt{n}} \quad \text{and} \quad \frac{(-1)^n}{\sqrt{n}} \left( 1 + \frac{(-1)^n}{\sqrt{n}} \right). \]


Dirichlet pointed out that the ratio of the nth terms approached 1 as n tended to 
infinity, but the first series converged and the second diverged. 
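
Dirichlet’s point can be seen numerically. The Python sketch below assumes the two series are those of his 1829 paper, with nth terms (−1)^n/√n and ((−1)^n/√n)(1 + (−1)^n/√n); the second differs from the first by the harmonic series, so its partial sums grow like log n even though the ratio of corresponding terms tends to 1. The function names are hypothetical, and a finite computation can of course only suggest, not prove, convergence or divergence.

```python
import math

def first_term(n):
    """nth term of the convergent alternating series: (-1)^n / sqrt(n)."""
    return (-1) ** n / math.sqrt(n)

def second_term(n):
    """nth term of the divergent companion: first_term(n) * (1 + (-1)^n / sqrt(n))."""
    return first_term(n) * (1 + (-1) ** n / math.sqrt(n))

def partial_sum(term, upto):
    """Sum of term(n) for n = 1..upto."""
    return sum(term(n) for n in range(1, upto + 1))

if __name__ == "__main__":
    for upto in (10**3, 10**4, 10**5):
        print(upto, partial_sum(first_term, upto), partial_sum(second_term, upto))
```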
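Now in article 235 of his book on heat, Fourier gave the formula 

\[ \phi(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} \phi(\alpha) \, d\alpha \left( \frac{1}{2} + \sum_{i=1}^{\infty} \cos i(x - \alpha) \right). \tag{19.53} \]

Briefly, Fourier arrived at (19.53) by starting with his result 

\[ \phi(x) = \frac{1}{2} a_0 + a_1 \cos x + a_2 \cos 2x + \cdots + b_1 \sin x + b_2 \sin 2x + \cdots, \tag{19.54} \]

where 

\[ a_i = \frac{1}{\pi} \int_{-\pi}^{\pi} \phi(\alpha) \cos i\alpha \, d\alpha, \qquad b_i = \frac{1}{\pi} \int_{-\pi}^{\pi} \phi(\alpha) \sin i\alpha \, d\alpha. \]

Observing that 

\[ \begin{aligned} a_i \cos ix + b_i \sin ix &= \frac{1}{\pi} \int_{-\pi}^{\pi} \phi(\alpha) (\cos ix \cos i\alpha + \sin ix \sin i\alpha) \, d\alpha \\ &= \frac{1}{\pi} \int_{-\pi}^{\pi} \phi(\alpha) \cos i(x - \alpha) \, d\alpha, \end{aligned} \tag{19.55} \]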


25 Dirichlet (1829b). 
Fourier substituted (19.55) into (19.54) to obtain (19.53). 
Dirichlet analyzed the partial sums of this series under the assumption that the 
function φ(x) was piecewise monotonic. Taking n + 1 terms of the series and using 

\[ \frac{1}{2} + \cos (\alpha - x) + \cos 2(\alpha - x) + \cdots + \cos n(\alpha - x) = \frac{\sin \left(n + \frac{1}{2}\right)(\alpha - x)}{2 \sin \frac{1}{2}(\alpha - x)}, \]

Dirichlet represented the partial sum by 

\[ s_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} \phi(\alpha) \frac{\sin \left(n + \frac{1}{2}\right)(\alpha - x)}{2 \sin \frac{1}{2}(\alpha - x)} \, d\alpha. \]

He proved that this integral converged to 

\[ \frac{\phi(x + 0) + \phi(x - 0)}{2}, \]

when φ satisfied certain conditions. For this purpose, he first demonstrated the 
theorem: For any function f(β), continuous and monotonic in the interval (g, h), 
where $0 \le g < h \le \frac{\pi}{2}$, the integral 

\[ \int_g^h f(\beta) \frac{\sin i\beta}{\sin \beta} \, d\beta \]

converges to a limit as i tends to infinity. The limit is zero except when g = 0, in which 
case the limit is $\frac{\pi}{2} f(0)$. We present Dirichlet’s argument in a slightly condensed form, 
for the most part using his notation. First note that we can write 

\[ \int_0^{\infty} \frac{\sin x}{x} \, dx = \frac{\pi}{2} \tag{19.56} \]

as 

\[ \int_0^{\pi} + \int_{\pi}^{2\pi} + \cdots + \int_{n\pi}^{(n+1)\pi} \frac{\sin x}{x} \, dx + \cdots = \frac{\pi}{2}. \]

Note that the integral (19.56) was evaluated by Euler in a paper of 1781. See 
Exercise 1 in Chapter 17. Since sin x changes signs in the successive intervals 
[0, π], [π, 2π], ... and 1/x is decreasing, we can write the sum as 

\[ k_1 - k_2 + k_3 - \cdots + (-1)^{n-1} k_n + \cdots = \frac{\pi}{2}, \tag{19.57} \]

where $k_{n+1} = \left| \int_{n\pi}^{(n+1)\pi} \frac{\sin x}{x} \, dx \right|$. 

The series converges and hence $k_n \to 0$ as $n \to \infty$. Now consider the integral 

\[ I = \int_0^h \frac{\sin i\beta}{\sin \beta} f(\beta) \, d\beta, \]
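where f(β) is decreasing and positive. Divide the interval [0, h] by the points 

\[ 0 < \frac{\pi}{i} < \frac{2\pi}{i} < \cdots < \frac{r\pi}{i} < h, \]

where r is the largest integer for which the last inequality holds. I is the sum of 
the integrals on these r + 1 subintervals. On comparing two of the consecutive 
subintervals, we see that for ν < r 

\[ I_{\nu} = \left| \int_{(\nu - 1)\pi/i}^{\nu\pi/i} \frac{\sin i\beta}{\sin \beta} f(\beta) \, d\beta \right| > I_{\nu + 1} = \left| \int_{\nu\pi/i}^{(\nu + 1)\pi/i} \frac{\sin i\beta}{\sin \beta} f(\beta) \, d\beta \right|. \]

Verify this by changing β to $\frac{\pi}{i} + \beta$, so that the second integral can be written as 

\[ \left| \int_{(\nu - 1)\pi/i}^{\nu\pi/i} \frac{\sin i\beta}{\sin \left(\beta + \frac{\pi}{i}\right)} f\left(\beta + \frac{\pi}{i}\right) d\beta \right|. \]

Also, f(β) is decreasing so that 

\[ \frac{f(\beta)}{\sin \beta} > \frac{f\left(\beta + \frac{\pi}{i}\right)}{\sin \left(\beta + \frac{\pi}{i}\right)}. \]

Thus, 

\[ I = I_1 - I_2 + I_3 - I_4 + \cdots \pm I_r \mp I_h, \]

where $I_h$ is defined over the interval $\left(\frac{r\pi}{i}, h\right)$ so that the $I_{\nu}$ are positive and decreasing 
right up to the last term $I_h$. Next, let 

\[ K_{\nu} = \left| \int_{(\nu - 1)\pi/i}^{\nu\pi/i} \frac{\sin i\beta}{\sin \beta} \, d\beta \right| = \left| \int_{(\nu - 1)\pi}^{\nu\pi} \frac{\sin \gamma}{i \sin \left(\frac{\gamma}{i}\right)} \, d\gamma \right|. \]

Observe that the last integral is obtained by the change of variables γ = iβ. As 
$i \to \infty$, this integral tends to $\left| \int_{(\nu - 1)\pi}^{\nu\pi} \frac{\sin \gamma}{\gamma} \, d\gamma \right| = k_{\nu}$. Next fix a number m, assumed 
for convenience to be even, and let r be greater than m. Let $\rho_{\nu}$ be such that 

\[ f\left(\frac{\nu\pi}{i}\right) \le \rho_{\nu} \le f\left(\frac{(\nu - 1)\pi}{i}\right) \quad \text{and} \quad I_{\nu} = \rho_{\nu} K_{\nu}. \]

Then 

\[ I = (K_1 \rho_1 - K_2 \rho_2 + K_3 \rho_3 - \cdots - K_m \rho_m) + (K_{m+1} \rho_{m+1} - K_{m+2} \rho_{m+2} + \cdots) = I(m) + I', \]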
where I(m) consists of the m terms inside the first set of parentheses and I′ represents 
the remaining terms, inside the second set of parentheses. Therefore, the sum I(m) as 
$i \to \infty$ converges to 

\[ f(0)(k_1 - k_2 + k_3 - \cdots - k_m) = s_m f(0). \]

This means that the difference between I(m) and $s_m f(0)$ can be made less than a positive number 
ω no matter how small. The sum I′ is an alternating series with decreasing terms and 
hence is less than $K_{m+1} \rho_{m+1}$; note that this converges to $k_{m+1} f(0)$. Thus, by (19.57), 
$|I'| < k_{m+1} |f(0)| + \omega'$, where ω′ can be made arbitrarily small. Moreover, 

\[ \left| \frac{\pi}{2} - s_m \right| < k_{m+1} \quad \text{and so} \quad \left| I - \frac{\pi}{2} f(0) \right| < \omega + \omega' + 2 f(0) k_{m+1}. \tag{19.58} \]

This proves the theorem for f positive and g = 0. If g > 0, then 

\[ \int_g^h \frac{\sin i\beta}{\sin \beta} f(\beta) \, d\beta = \int_0^h \frac{\sin i\beta}{\sin \beta} f(\beta) \, d\beta - \int_0^g \frac{\sin i\beta}{\sin \beta} f(\beta) \, d\beta. \]

At this point, one may conclude that both these integrals tend to $\frac{\pi}{2} f(0)$ as $i \to \infty$. 
So the integral over (g, h) tends to 0 as $i \to \infty$. This proves the theorem for positive decreasing f. If f 
also assumes negative values, then choose a constant C large enough that C + f is 
positive. If f is increasing, − f is decreasing, taking care of that case, and the theorem 
is proved. 
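
The statement of the theorem can be illustrated numerically: for a positive decreasing f, the integral of f(β) sin(iβ)/sin β over (0, h) should approach (π/2) f(0) for large i, while over (g, h) with g > 0 it should approach 0. The Python sketch below uses a composite Simpson rule with f = cos, h = 1 and i = 401; these choices, the step count, and the name `dirichlet_integral` are assumptions for this example, and the agreement is only to within an error of order 1/i.

```python
import math

def dirichlet_integral(f, g, h, i, steps=20_000):
    """Simpson approximation of the integral of f(b) * sin(i*b) / sin(b) over [g, h]."""
    def integrand(b):
        if b == 0.0:
            return f(0.0) * i  # limiting value of sin(i*b)/sin(b) as b -> 0
        return f(b) * math.sin(i * b) / math.sin(b)

    width = (h - g) / steps  # steps must be even for Simpson's rule
    total = integrand(g) + integrand(h)
    for s in range(1, steps):
        total += (4 if s % 2 else 2) * integrand(g + s * width)
    return total * width / 3

if __name__ == "__main__":
    print(dirichlet_integral(math.cos, 0.0, 1.0, 401), math.pi / 2)
    print(dirichlet_integral(math.cos, 0.5, 1.0, 401), 0.0)
```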

Dirichlet noted that if f was discontinuous at 0, then by the previous argument, 
f(0) could be replaced by f(ε) where ε was an infinitely small positive number. In 
his 1837 paper, he denoted f(x + ε) by f(x + 0), the right-hand limit of f(t) as 
$t \to x$. This is now standard notation. 

To prove Fourier’s theorem, break up the integral for $s_n(x)$ into two parts, one taken
from $-\pi$ to $x$ and the other from $x$ to $\pi$. If $\alpha$ is replaced by $x - 2\beta$ in the first integral
and by $x + 2\beta$ in the second, then we have

$$s_n(x) = \frac{1}{\pi}\int_0^{\frac{\pi+x}{2}} \frac{\sin(2n+1)\beta}{\sin\beta}\,\varphi(x - 2\beta)\,d\beta + \frac{1}{\pi}\int_0^{\frac{\pi-x}{2}} \frac{\sin(2n+1)\beta}{\sin\beta}\,\varphi(x + 2\beta)\,d\beta.$$

Suppose $x \neq -\pi$ or $\pi$ and $\frac{\pi-x}{2} < \frac{\pi}{2}$. The function $\varphi(x+2\beta)$ in the second integral
may be discontinuous at several points between $\beta = 0$ and $\beta = \frac{\pi-x}{2}$, and it may also
have several extremal points in this interval. Denote these points by $l, l', l'', \ldots, l^{(\nu)}$ in
ascending order, and decompose the second integral over the intervals $(0,l), (l,l'), \ldots$.
By the theorem, the first of these $\nu+1$ integrals has the limit $\frac{\pi}{2}\varphi(x+\epsilon)$ (i.e., $\frac{\pi}{2}\varphi(x+0)$)
and the others have the limit zero as $n \to \infty$. If in the first integral for $s_n(x)$ we have
$\frac{\pi+x}{2} > \frac{\pi}{2}$, then write it as


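$$\int_0^{\frac{\pi}{2}} \frac{\sin(2n+1)\beta}{\sin\beta}\,\varphi(x - 2\beta)\,d\beta + \int_{\frac{\pi}{2}}^{\frac{\pi+x}{2}} \frac{\sin(2n+1)\beta}{\sin\beta}\,\varphi(x - 2\beta)\,d\beta.$$

The first integral tends to $\frac{\pi}{2}\varphi(x - \epsilon)$ as $n \to \infty$. A similar argument shows that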


i sin(2n + 1)B 
0 


: (x — 2B)dB tends to OGtLe, 
sin B 5) 


19.7 Dirichlet: On the Evaluation of Gauss Sums 


In 1835, Dirichlet presented a paper to the Berlin Academy²⁶ explaining how the
definite integral $\int_{-\infty}^{\infty} e^{ix^2}\,dx$ could be applied to evaluate the Gauss sum
$\sum_{k=0}^{n-1} e^{2\pi i k^2/n}$. He gave a slightly simpler version of this proof in his famous
1840²⁷ paper in Crelle’s Journal, on the applications of infinitesimal analysis to the
theory of numbers. In this paper, Dirichlet also derived the class number formula for
quadratic forms and proved his well-known theorem on primes in arithmetic progression.

He started his evaluation of the Gauss sum by first proving a finite form of the
Poisson summation formula. Note that in 1826, Poisson had used such a finite formula
to deduce the Euler–Maclaurin summation. Dirichlet began with a continuous function
$g(x)$ in $[0,\pi]$ expandable as a Fourier series:

$$\pi g(x) = c_0 + 2\sum_{s=1}^{\infty} c_s \cos sx, \tag{19.59}$$

where

$$c_s = \int_0^{\pi} g(x)\cos sx\,dx. \tag{19.60}$$

It followed for $x = 0$ that

$$c_0 + 2\sum_{s=1}^{\infty} c_s = \pi g(0). \tag{19.61}$$

He then set

$$g(x) = f(x) + f(2\pi - x) + f(2\pi + x) + \cdots + f(2(h-1)\pi + x) + f(2h\pi - x), \tag{19.62}$$

where $f(x)$ was continuous on $[0, 2h\pi]$. He observed that

$$c_s = \int_0^{\pi} g(x)\cos sx\,dx = \int_0^{2h\pi} f(x)\cos sx\,dx. \tag{19.63}$$

By using the value of $g(0)$ from (19.62), he could rewrite (19.61) in the form


26 Dirichlet (1969) vol. 1, pp. 237-256. 
27 ibid. pp. 410-496, especially pp. 473-479. 
$$c_0 + 2\sum_{s=1}^{\infty} c_s = \pi\Bigl(f(0) + f(2h\pi) + 2\sum_{s=1}^{h-1} f(2s\pi)\Bigr), \tag{19.64}$$

where $c_s$ was given by (19.63). This was the finite form of the Poisson summation
formula employed by Dirichlet. He then considered the integral

$$\int_{-\infty}^{\infty} \cos x^2\,dx = a,$$

where $a$ was some number to be determined. Since Euler had evaluated this integral
in 1781, Dirichlet knew its exact value. See Exercise 1 in Chapter 17. Note here that
Dirichlet’s method of evaluating Gauss sums was such that it also determined the value
of $a$. He set

$$x = \frac{z}{2}\sqrt{\frac{n}{2\pi}},$$

where $n$ was a positive integer divisible by 4, transforming the integral to

$$\int_{-\infty}^{\infty} \cos\Bigl(\frac{n}{8\pi}z^2\Bigr)dz = 2a\sqrt{\frac{2\pi}{n}}. \tag{19.65}$$


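Dirichlet then rewrote the last integral as a sum:

$$\sum_{s=-\infty}^{\infty}\int_{2s\pi}^{2(s+1)\pi} \cos\Bigl(\frac{n}{8\pi}z^2\Bigr)dz = \sum_{s=-\infty}^{\infty}\int_0^{2\pi} \cos\Bigl(\frac{n}{8\pi}(2s\pi + z)^2\Bigr)dz. \tag{19.66}$$

He observed that since $n$ was divisible by 4,

$$\cos\frac{n}{8\pi}(2s\pi + z)^2 = \cos\frac{n}{8\pi}(4s^2\pi^2 + 4s\pi z + z^2) = \cos\Bigl(\frac{snz}{2} + \frac{n}{8\pi}z^2\Bigr).$$

Then, by the addition formula, he had

$$\cos\Bigl(\frac{snz}{2} + \frac{n}{8\pi}z^2\Bigr) + \cos\Bigl(-\frac{snz}{2} + \frac{n}{8\pi}z^2\Bigr) = 2\cos\Bigl(\frac{n}{8\pi}z^2\Bigr)\cos\Bigl(\frac{snz}{2}\Bigr).$$

Hence, (19.65) and (19.66) could be expressed as

$$\int_0^{2\pi} \cos\Bigl(\frac{n}{8\pi}z^2\Bigr)dz + 2\sum_{s=1}^{\infty}\int_0^{2\pi} \cos\Bigl(\frac{n}{8\pi}z^2\Bigr)\cos\Bigl(\frac{snz}{2}\Bigr)dz = 2a\sqrt{\frac{2\pi}{n}}.$$

Dirichlet substituted $nz = 2x$ in this formula to obtain

$$\int_0^{n\pi} \cos\Bigl(\frac{x^2}{2n\pi}\Bigr)dx + 2\sum_{s=1}^{\infty}\int_0^{n\pi} \cos\Bigl(\frac{x^2}{2n\pi}\Bigr)\cos sx\,dx = a\sqrt{2n\pi}. \tag{19.67}$$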
Since $n$ was an even number expressible as $2h$, the sum on the left-hand side coincided
with the sum on the left-hand side of (19.64) when $f(x) = \cos\bigl(\frac{x^2}{2n\pi}\bigr)$. By combining
(19.64) and (19.67), Dirichlet arrived at the formula

$$\cos 0 + \cos\Bigl(\frac{n\pi}{2}\Bigr) + 2\sum_{s=1}^{\frac{n}{2}-1} \cos\frac{2\pi s^2}{n} = a\sqrt{\frac{2n}{\pi}}. \tag{19.68}$$

He then observed that

$$\cos\frac{2\pi s^2}{n} = \cos\frac{2\pi (n-s)^2}{n},$$

and therefore (19.68) could be expressed in the simpler form

$$\sum_{s=0}^{n-1} \cos\frac{2\pi s^2}{n} = a\sqrt{\frac{2n}{\pi}}. \tag{19.69}$$

Dirichlet next remarked that the value of $a$ was independent of $n$ so that by choosing
$n = 4$, he could write

$$2 = a\sqrt{\frac{8}{\pi}} \quad\text{or}\quad a = \sqrt{\frac{\pi}{2}},$$

and therefore he could express the Gauss sum as

$$\sum_{s=0}^{n-1} \cos\frac{2\pi s^2}{n} = \sqrt{n}. \tag{19.70}$$

Operating in the same manner with $\int_{-\infty}^{\infty} \sin x^2\,dx$, he arrived at

$$\sum_{s=0}^{n-1} \sin\frac{2\pi s^2}{n} = \sqrt{n}. \tag{19.71}$$


Dirichlet pointed out that the sums (19.70) and (19.71) could be similarly evaluated
for $n$ of the form $4\mu+1$, $4\mu+2$, and $4\mu+3$. However, it was possible to obtain these
sums in a different way. For that purpose he defined, for positive integers $m$ and $n$,

$$\sum_{s=0}^{n-1} e^{\frac{2ms^2\pi i}{n}} = \phi(m,n).$$

He then wrote

$$\phi(m,n) = \phi(m',n) \quad\text{when } m \equiv m' \pmod{n}; \tag{19.72}$$

$$\phi(m,n) = \phi(c^2m,n) \quad\text{when } c \text{ was prime to } n; \tag{19.73}$$

$$\phi(m,n)\phi(n,m) = \phi(1,mn) \quad\text{when } m \text{ and } n \text{ were coprime}. \tag{19.74}$$
Dirichlet proved (19.74) by observing that

$$\phi(m,n)\phi(n,m) = \sum_{s=0}^{n-1}\sum_{t=0}^{m-1} e^{\frac{2ms^2\pi i}{n}}\, e^{\frac{2nt^2\pi i}{m}} = \sum_{s=0}^{n-1}\sum_{t=0}^{m-1} e^{\frac{(m^2s^2+n^2t^2)2\pi i}{mn}} = \sum_{s=0}^{n-1}\sum_{t=0}^{m-1} e^{\frac{(ms+nt)^2 2\pi i}{mn}}.$$

Since $m$ and $n$ were chosen relatively prime, Dirichlet argued that $ms + nt$ assumed all
the residues (mod $mn$) as $s$ and $t$ ranged over the values $0, 1, \ldots, n-1$ and $0, 1, \ldots,
m-1$, respectively. Therefore,

$$\phi(m,n)\phi(n,m) = \sum_{s=0}^{mn-1} e^{\frac{2s^2\pi i}{mn}}, \tag{19.75}$$

and Dirichlet’s proof of (19.74) was complete. Note that Gauss gave a similar
argument in 1801, though it was published later.²⁸ Dirichlet then observed that for
$n$ a multiple of 4, (19.70) and (19.71) implied

$$\phi(1,n) = (1+i)\sqrt{n}. \tag{19.76}$$

And for odd $n$, (19.74) and (19.76) gave

$$\phi(4,n)\phi(n,4) = \phi(1,4n) = 2(1+i)\sqrt{n}. \tag{19.77}$$

Moreover, by (19.73), $\phi(4,n) = \phi(1,n)$ when $n$ was odd, and by (19.72)
$\phi(n,4) = \phi(1,4)$ or $\phi(3,4)$, depending on whether $n = 4\mu+1$ or $n = 4\mu+3$.
Since

$$\phi(1,4) = 2(1+i), \quad\text{and}\quad \phi(3,4) = 2(1-i),$$

Dirichlet could conclude that

$$\phi(1,n) = \sqrt{n},\quad n = 4\mu+1; \qquad \phi(1,n) = i\sqrt{n},\quad n = 4\mu+3. \tag{19.78}$$

Finally, when $n = 4\mu+2$, he argued that $\frac{n}{2}$ and 2 were relatively prime, so that by
(19.74)

$$\phi\Bigl(2,\frac{n}{2}\Bigr)\phi\Bigl(\frac{n}{2},2\Bigr) = \phi(1,n) \quad\text{and}\quad \phi\Bigl(\frac{n}{2},2\Bigr) = 1 + e^{\frac{n\pi i}{2}} = 0.$$


28 Gauss (1863-1927) vol. 2, pp. 11-45. 
Thus,

$$\phi(1,n) = 0, \quad n = 4\mu+2.$$
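
The four cases of Dirichlet's evaluation are easy to test numerically. The following is a minimal sketch, not part of Dirichlet's argument: the function names `gauss_sum` and `dirichlet_value` are illustrative choices, and floating-point arithmetic is assumed to be accurate enough for small $n$.

```python
import cmath
import math


def gauss_sum(m, n):
    """Return phi(m, n) = sum_{s=0}^{n-1} exp(2*pi*i*m*s^2/n)."""
    # Reduce m*s^2 modulo n first so the floating-point angle stays small.
    return sum(cmath.exp(2j * math.pi * ((m * s * s) % n) / n) for s in range(n))


def dirichlet_value(n):
    """Closed form for phi(1, n) according to the residue class of n modulo 4."""
    root = math.sqrt(n)
    return {0: (1 + 1j) * root, 1: root, 2: 0, 3: 1j * root}[n % 4]


if __name__ == "__main__":
    for n in (4, 5, 6, 7, 12, 13):
        print(n, gauss_sum(1, n), dirichlet_value(n))
    # Multiplicativity (19.74) for the coprime pair m = 3, n = 5.
    print(gauss_sum(3, 5) * gauss_sum(5, 3), gauss_sum(1, 15))
```

The last line compares the two sides of (19.74) for one coprime pair; agreement up to rounding error is a check, not a proof.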


Gauss gave a proof of the quadratic reciprocity theorem by using the values of the
Gauss sums. Dirichlet repeated these arguments in his papers, though in a simpler
form. In fact, in his papers and lectures Dirichlet presented many number theoretic
ideas of Gauss within an easily understandable approach. For example, he published a
one-page proof of a theorem of Gauss on the biquadratic character of 2. It is interesting
to note that the British number theorist, H. J. S. Smith (1826–1883), presented this
result in the first part of his report on number theory published in 1859; he wrote in a
footnote:²⁹


The death of this eminent geometer in the present year (May 5, 1859) is an irreparable loss 
to the science of arithmetic. His original investigations have probably contributed more to its 
advancement than those of any other writer since the time of Gauss; if, at least, we estimate 
results rather by their importance than by their number. He has also applied himself (in several of 
his memoirs) to give an elementary character to arithmetical theories which, as they appear in the 
work of Gauss, are tedious and obscure; and he has thus done much to popularize the theory of 
numbers among mathematicians — a service which it is impossible to appreciate too highly. 


Noting Smith’s remark on the importance, rather than the number, of Dirichlet’s 
results, we observe that Gauss made a similar comment when he recommended 
Dirichlet for the order pour le mérite in 1845:³⁰ “The same [Dirichlet] has – as far
as I know – not yet published a big work, and also his individual memoirs do not
yet comprise a big volume. But they are jewels, and one does not weigh jewels on a 
grocer’s scales.” 


19.8 Schaar: Reciprocity of Gauss Sums 


In 1850, the Belgian mathematician Mathias Schaar (1817–1867) derived a remarkable
reciprocity formula for Gauss sums:³¹

$$\sqrt{\frac{q}{2p}}\; e^{\frac{\pi i}{4}} \sum_{k=0}^{2p-1} e^{-\frac{\pi i q k^2}{2p}} = \sum_{k=0}^{q-1} e^{\frac{2\pi i p k^2}{q}}, \tag{19.79}$$

where $2p$ and $q$ are relatively prime integers. This formula contains the value of the
Gauss sum (19.76) and also implies the law of quadratic reciprocity. We present a
streamlined version of Schaar’s argument. In this context it is interesting to observe
that in his evaluation of the Gauss sum, instead of working with

$$f(x) = \cos\frac{x^2}{2n\pi} \quad\text{and}\quad f(x) = \sin\frac{x^2}{2n\pi}$$


29 Smith (1965b) p. 72. 
30 Duke and Tschinkel (2005) p. 18. 
31 Schaar (1850). 
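separately, Dirichlet could have used the function $f(x) = e^{\frac{ix^2}{2n\pi}}$. The right-hand side
of (19.79) suggests that one should consider the function $f(x) = e^{\frac{2\pi i x^2}{q}}$ or, indeed, the
function $f(x) = e^{\frac{2\pi i p x^2}{q}}$.

Thus, instead of (19.59) and (19.60), take a continuous function $g(x)$ on $[0, 1]$ to find
that

$$g(x) = \sum_{k=-\infty}^{\infty} c_k e^{-2\pi i k x}, \quad\text{where}\quad c_k = \int_0^1 g(x)\,e^{2\pi i k x}\,dx;$$

when $x = 0$,

$$g(0) = \sum_{k=-\infty}^{\infty} c_k.$$

Now if we take $f(x)$ continuous on $[0,q]$, and $g(x) = f(x) + f(x+1) + \cdots + f(x+q-1)$, then

$$c_k = \int_0^1 g(x)\,e^{2\pi i k x}\,dx = \int_0^q f(x)\,e^{2\pi i k x}\,dx.$$

Moreover, as in (19.64), this gives us

$$\sum_{k=-\infty}^{\infty} \int_0^q f(x)\,e^{2\pi i k x}\,dx = \sum_{s=0}^{q-1} f(s).$$

Taking $f(x) = e^{\frac{2\pi i p x^2}{q}}$, we then get

$$\begin{aligned}
S = \sum_{k=0}^{q-1} e^{\frac{2\pi i p k^2}{q}} &= q \sum_{\nu=-\infty}^{\infty} \int_0^1 e^{2\pi i p q t^2 + 2\pi i \nu q t}\,dt \\
&= q \sum_{\nu=-\infty}^{\infty} \int_0^1 e^{2\pi i p q \left(t^2 + \frac{\nu t}{p}\right)}\,dt \\
&= q \sum_{\nu=-\infty}^{\infty} e^{-\frac{\pi i q \nu^2}{2p}} \int_0^1 e^{2\pi i p q \left(t + \frac{\nu}{2p}\right)^2}\,dt.
\end{aligned} \tag{19.80}$$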
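Applying the change of variables $s = t + \frac{\nu}{2p}$ in (19.80), we see that

$$S = q \sum_{\nu=-\infty}^{\infty} e^{-\frac{\pi i q \nu^2}{2p}} \int_{\frac{\nu}{2p}}^{1+\frac{\nu}{2p}} e^{2\pi i p q s^2}\,ds. \tag{19.81}$$

Schaar next broke up the sum in (19.81) into $2p$ sums by summing over
$\nu \equiv k \pmod{2p}$, $k = 0, 1, 2, \ldots, 2p-1$, to obtain

$$\begin{aligned}
S &= q \sum_{\nu=-\infty}^{\infty} \sum_{k=0}^{2p-1} e^{-\frac{\pi i q k^2}{2p}} \int_{\nu+\frac{k}{2p}}^{\nu+1+\frac{k}{2p}} e^{2\pi i p q s^2}\,ds \\
&= q \sum_{k=0}^{2p-1} e^{-\frac{\pi i q k^2}{2p}} \int_{-\infty}^{\infty} e^{2\pi i p q s^2}\,ds.
\end{aligned} \tag{19.82}$$

Finally, we see that since

$$\int_{-\infty}^{\infty} e^{2\pi i p q s^2}\,ds = \frac{e^{\frac{\pi i}{4}}}{\sqrt{2pq}},$$

we have

$$\sum_{k=0}^{q-1} e^{\frac{2\pi i p k^2}{q}} = \sqrt{\frac{q}{2p}}\; e^{\frac{\pi i}{4}} \sum_{k=0}^{2p-1} e^{-\frac{\pi i q k^2}{2p}},$$

completing our summary of Schaar’s proof of the reciprocity formula for Gauss sums.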
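
As a sanity check on the statement of the reciprocity formula, both sides can be compared numerically for small $p$ and $q$. This is only an illustrative sketch: the function names are hypothetical, and the test loop restricts itself to pairs with $2p$ and $q$ relatively prime, as in the statement of (19.79).

```python
import cmath
import math


def quadratic_sum(p, q):
    """sum_{k=0}^{q-1} exp(2*pi*i*p*k^2/q), one side of (19.79)."""
    return sum(cmath.exp(2j * math.pi * ((p * k * k) % q) / q) for k in range(q))


def reciprocal_sum(p, q):
    """sqrt(q/(2p)) * e^{pi i/4} * sum_{k=0}^{2p-1} exp(-pi*i*q*k^2/(2p))."""
    # exp(-pi*i*x/(2p)) has period 4p in x, so q*k^2 may be reduced modulo 4p.
    total = sum(
        cmath.exp(-1j * math.pi * ((q * k * k) % (4 * p)) / (2 * p))
        for k in range(2 * p)
    )
    return math.sqrt(q / (2 * p)) * cmath.exp(1j * math.pi / 4) * total


if __name__ == "__main__":
    for p, q in [(1, 3), (2, 3), (3, 5), (5, 7)]:
        print(p, q, quadratic_sum(p, q), reciprocal_sum(p, q))
```

For $p = 1$, $q = 3$ both sides equal $i\sqrt{3}$, which is the value of $\phi(1,3)$ in (19.78).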


19.9 Exercises 


(1) Solve the equation $\frac{d^2v}{dx^2} + \frac{d^2v}{dy^2} = 0$ by assuming $v = F(x) f(y)$ to obtain
$F(x) = e^{-mx}$, $f(y) = \cos my$. Let $v = \phi(x,y)$ and assume the boundary
conditions $\phi\bigl(x, \pm\frac{\pi}{2}\bigr) = 0$ and $\phi(0, y) = 1$. Show that

$$\phi(x,y) = \frac{4}{\pi}\Bigl(e^{-x}\cos y - \frac{1}{3}e^{-3x}\cos 3y + \frac{1}{5}e^{-5x}\cos 5y - \cdots\Bigr).$$

See Fourier (1955) pp. 134–144.
(2) Show that $\frac{\pi}{2} = \arctan u + \arctan\frac{1}{u}$. Let $u = e^{ix}$ and expand $\arctan u$ and
$\arctan\frac{1}{u}$ as series to obtain

$$\frac{\pi}{4} = \cos x - \frac{1}{3}\cos 3x + \frac{1}{5}\cos 5x - \cdots.$$

See Fourier (1955) p. 154.
(3) Let $a$ denote a quadratic residue modulo $p$, a prime, and let $b$ denote a quadratic
nonresidue. Show that

$$\phi(1,p) = 1 + 2\sum_{a} e^{\frac{2a\pi i}{p}} = i^{\left(\frac{p-1}{2}\right)^2}\sqrt{p},$$

where the sum is over all the residues $a$. Show also that

$$\phi(m,p) = \Bigl(\frac{m}{p}\Bigr)\phi(1,p) = 1 + 2\sum_{a} e^{\frac{2am\pi i}{p}},$$

where $\bigl(\frac{m}{p}\bigr)$ denotes the Legendre symbol. Deduce that

$$\sum_{a} e^{\frac{2m\pi i a}{p}} - \sum_{b} e^{\frac{2m\pi i b}{p}} = \Bigl(\frac{m}{p}\Bigr)\, i^{\left(\frac{p-1}{2}\right)^2}\sqrt{p}.$$

See Dirichlet (1969) pp. 478–479.
(4) Suppose $p$ and $q$ are distinct odd primes. Use $\phi(p,q)\phi(q,p) = \phi(1,pq)$ and the
results in the previous exercise to prove the law of quadratic reciprocity:

$$\Bigl(\frac{p}{q}\Bigr)\Bigl(\frac{q}{p}\Bigr) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$

This proof originates with Gauss’s 1808 paper, “Summatio Quarundam
Serierum Singularium.” For the derivation discussed in this exercise, see
Dirichlet and Dedekind (1999) pp. 206–207.

19.10 Notes on the Literature 


Truesdell (1960) is a detailed history of the mechanics of flexible or elastic bodies 
from 1638 to 1788. He includes a discussion of those aspects of the works of Euler, 
d’Alembert, D. Bernoulli, and Lagrange that led to the consideration of trigonometric 
series in such problems. For Wiener’s treatment of Euler’s difference equation (19.13), 
see Wiener (1979) vol. 2, pp. 443-453. Wiener’s paper contributed to the effort to 
make operational calculus rigorous. Yushkevich (1971) and Bottazzini (1986) chapter 
1 deal with the development of the concept of a function in connection with trigono- 
metric series. Fourier’s 1807 memoir on heat conduction was never published by the 
French Academy; Fourier’s famous book of 1822 was a reworking of this memoir. 
However, Grattan-Guinness (1972) has helpfully presented us with the original 1807 
memoir. For a mathematical biography of Dirichlet, see Merzbach (2018) and for a
briefer biography, see Elstrodt (2005) in Duke and Tschinkel (2005).


The Euler–Maclaurin Summation Formula


20.1 Preliminary Remarks 


The Euler—Maclaurin summation formula is among the most useful and important 
formulas in all of mathematics, independently discovered by Euler and Maclaurin in 
the early 1730s. 

The Euler–Maclaurin summation formula arose out of efforts to find approximate
values for finite and infinite series. During the 1720s, the series $\zeta(2) = \sum_{n=1}^{\infty} \frac{1}{n^2}$
received a good deal of attention. Since the exact evaluation of this series appeared
to be out of reach at that time, several mathematicians devised methods to compute
approximations for this series. Stirling found some ingenious methods for transforming
this and similar slowly convergent series to more rapidly convergent series. See,
for example, (10.54) for one such method. In his Methodus Differentialis of 1730,
Stirling computed $\zeta(2)$ by three different methods, one of which gave the correct
value to sixteen decimal places. Around 1727, Daniel Bernoulli and Goldbach also
showed a passing interest in the problem by computing $\zeta(2)$ to a few decimal places.
This may have caused Euler, their colleague at the St. Petersburg Academy, to study
this problem. In a paper of 1731,¹ Euler used integration to derive the formula

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \sum_{n=1}^{\infty} \frac{1}{2^{n-1}n^2} + (\ln 2)^2. \tag{20.1}$$


Euler derived (20.1), or (15.52), as an immediate consequence of (15.53), but his 
derivation of (15.53) was complicated. For this reason, in Chapter 15 we gave Abel’s 
derivation. 
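
The gain in speed of convergence offered by (20.1) can be seen in a short computation. The sketch below is illustrative only: the function names are hypothetical, and ordinary double-precision arithmetic stands in for Euler's hand calculation.

```python
import math


def zeta2_direct(terms):
    """Partial sum of the slowly convergent series sum 1/n^2."""
    return sum(1.0 / (n * n) for n in range(1, terms + 1))


def zeta2_euler(terms):
    """Partial sum of the transformed series in (20.1), plus (ln 2)^2."""
    series = sum(1.0 / (2 ** (n - 1) * n * n) for n in range(1, terms + 1))
    return series + math.log(2) ** 2


if __name__ == "__main__":
    exact = math.pi ** 2 / 6
    for terms in (5, 10, 20, 40):
        print(terms, exact - zeta2_direct(terms), exact - zeta2_euler(terms))
```

With forty terms the direct series is still off in the second decimal place, while the transformed series agrees with $\pi^2/6$ to about the limit of double precision.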

Observe that the series on the right-hand side of (20.1) was evidently much
more rapidly convergent than the original series for $\zeta(2)$, and Euler determined that
$\zeta(2) \approx 1.644934$. But Euler had a result more general than (20.1), namely (15.53), that
involved the dilogarithmic function defined by the series $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$. Perhaps Euler’s
work on the dilogarithm led him to apply calculus to the problem of the summation of


1 E 20 § 22. Eu. I-14 pp. 25–41.
the general series $\sum_{k=1}^{n} f(k)$. The result was a paper he presented to the Academy in
1732² (published in 1738) in which he briefly mentioned the Euler–Maclaurin formula
in the form

$$\sum_{k=1}^{n} t(k) = \int t\,dn + \alpha t + \beta\frac{dt}{dn} + \gamma\frac{d^2t}{dn^2} + \delta\frac{d^3t}{dn^3} + \cdots, \tag{20.2}$$

where $\alpha, \beta, \gamma, \ldots$ were computed from the equations

$$\alpha = \frac{1}{2},\quad \beta = \frac{\alpha}{2} - \frac{1}{6},\quad \gamma = \frac{\beta}{2} - \frac{\alpha}{6} + \frac{1}{24},\quad \delta = \frac{\gamma}{2} - \frac{\beta}{6} + \frac{\alpha}{24} - \frac{1}{120},\ \ldots.$$

He gave an application of (20.2) to the summation of the very simple example,
$\sum_{k=1}^{n} (k^2 + 2k)$, and then proceeded to discuss other types of series. He wrote a
longer paper³ on the subject three years later, in 1735, in which he explicitly evaluated
$\sum_{k=1}^{n} k^r$ as polynomials in $n$ for $r = 1, 2, \ldots, 16$. This should have alerted Euler
to the fact that $\alpha, \beta, \gamma, \delta, \ldots$ were closely related to the Bernoulli numbers. Jakob
Bernoulli defined his numbers in exactly this context, except that in his published work
he gave the polynomials up to $r = 10$. Euler was perhaps not aware of Bernoulli’s
work on sums of powers of integers at this stage. By 1727, he had certainly studied at
least portions of Bernoulli’s Ars Conjectandi.⁴ But it was only in 1755⁵ that Euler, by
then fully aware of Bernoulli’s contributions, followed de Moivre in adopting the term
Bernoulli numbers. In a highly interesting paper,⁶ written in 1740, that we discuss in
Chapter 2, Euler explained that the generating function for the numbers $\alpha, \beta, \gamma, \ldots$
was given by

$$S = 1 + \alpha z + \beta z^2 + \gamma z^3 + \cdots = \frac{1}{1 - \frac{z}{1\cdot 2} + \frac{z^2}{1\cdot 2\cdot 3} - \frac{z^3}{1\cdot 2\cdot 3\cdot 4} + \frac{z^4}{1\cdot 2\cdot 3\cdot 4\cdot 5} - \cdots} = \frac{z}{1 - e^{-z}}. \tag{20.3}$$

Euler also offered an explanation for the appearance of the Bernoulli numbers in
two such very different situations: in the values of $\zeta(2n)$, $n = 1, 2, 3, \ldots$ and in the
Euler–Maclaurin formula. In rough terms, his explanation was that the generating
function for both cases was the same. In his 1735 paper, he also computed $\zeta(n) =
\sum_{k=1}^{\infty} \frac{1}{k^n}$ to fifteen decimal places for $n = 2, 3, 4$, making use of (20.2). The series
on the right-hand side of (20.2) in these cases were asymptotic series and Euler
manipulated them exactly as Stirling and de Moivre had done in a different context,


2 E 25 § 2. Eu. I-14 pp. 42–72.

3 Eu. I-14 pp. 108–123. E 47 § 23.

4 Calinger (2016) p. 24.

5 Eu. I-10 p. 335. E 212 Part II, § 122.
6 Eu. I-14 pp. 407–462. E 130, § 22.
using the first few terms of the asymptotic series, up to the point where the terms
started getting large. 

Maclaurin’s results related to (20.2) appeared in his influential book Treatise of 
Fluxions. This work was published in 1742 in two volumes, although the first volume, 
containing the statements of the Euler-Maclaurin formula, was already typeset in 
1737. Colin Maclaurin (1698-1746) studied at the University of Glasgow, Scotland, 
but the mathematician who had the greatest formative influence on him was Newton, 
whom he met in 1719. Much of Maclaurin’s work on algebra, calculus, and dynamics 
arose directly from topics on which Newton had published results. Maclaurin was 
professor of mathematics at the University of Edinburgh from 1726 to 1746, having 
been recommended to the position by Newton. Maclaurin was probably inspired to
discover the Euler–Maclaurin formula by the results of de Moivre and Stirling on the
asymptotic series for $\sum_{k=1}^{n} \ln k$. Newton had a result on the sum $\sum_{k=0}^{n} \frac{1}{a+kb}$, giving
the first terms of the Euler–Maclaurin formula for this particular case. Newton gave
this result in a letter of July 20, 1671, to Collins,⁷ but Maclaurin was most likely
unaware of it.

It is a curious fact that Euler and Maclaurin learned of each other’s works even
before they were published. This was a result of the brief correspondence between
Euler and Stirling. In June 1736, Euler wrote Stirling⁸ about his formula and
mentioned applications to the summation of $\sum_{k=1}^{\infty} \frac{1}{k^2}$ and $\sum_{k=1}^{x} \frac{1}{k}$. He wrote the latter
result as

$$1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{x} = C + \ln x + \frac{1}{2x} - \frac{1}{12x^2} + \frac{1}{120x^4} - \frac{1}{252x^6} + \frac{1}{240x^8} - \frac{1}{132x^{10}} + \frac{691}{32760x^{12}} - \text{etc.} \tag{20.4}$$

We can see that the value of $C$, Euler’s constant $\gamma$, would be

$$\lim_{x\to\infty}\Bigl(\sum_{k=1}^{x} \frac{1}{k} - \ln x\Bigr),$$

and Euler gave this value as 0.5772156649015329 in his 1735 paper and in his letter.
Note that the series on the right-hand side of (20.4), from the fourth term onward, can
be written as

$$-\sum_{n=1}^{\infty} \frac{B_{2n}}{2n\,x^{2n}}.$$
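
Euler's value of $C$ can be recovered from (20.4) with very little work. The following sketch assumes the choice $x = 10$ and uses only the terms displayed in (20.4); the names `TERMS` and `euler_constant` are illustrative, not Euler's.

```python
import math

# (numerator, denominator, power of x) for the terms after 1/(2x) in (20.4).
TERMS = [(-1, 12, 2), (1, 120, 4), (-1, 252, 6),
         (1, 240, 8), (-1, 132, 10), (691, 32760, 12)]


def euler_constant(x=10):
    """Solve (20.4) for C, truncating the series after the x^(-12) term."""
    harmonic = math.fsum(1.0 / k for k in range(1, x + 1))
    series = 1.0 / (2 * x) + sum(num / (den * x ** power)
                                 for num, den, power in TERMS)
    return harmonic - math.log(x) - series


if __name__ == "__main__":
    print(euler_constant(10))
```

Only ten terms of the harmonic series are summed here; the first omitted term of the asymptotic series is already below $10^{-15}$ at $x = 10$.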


Then in 1737, Stirling received from Maclaurin the galley proofs of some portions 
of the first volume of Maclaurin’s treatise, containing two formulations of the Euler- 
Maclaurin formula. Because of some business preoccupations, Stirling did not reply 


7 Newton (1959–60) vol. 1, pp. 68–70.
8 Tweddle (1988) pp. 141-144. 
to Euler’s letter until April 1738. He then informed Euler about Maclaurin’s work
and about his communications with Maclaurin on Euler’s work. Stirling also told
Euler that Maclaurin had promised to acknowledge Euler’s work in his book. And
indeed Maclaurin did so.⁹ Concerning this point, Euler wrote a reply to Stirling on
July 27, 1738:¹⁰


But in this matter I have very little desire for anything to be detracted from the fame of the 
celebrated Mr Maclaurin since he probably came upon the same theorem for summing series 
before me, and consequently deserved to be named as its first discoverer. For I found that theorem 
about four years ago, at which time I also described its proof and application in greater detail to 
our Academy. 


Unfortunately, Euler uncharacteristically forgot to mention Maclaurin in his differ- 
ential calculus book of 1755, where he discussed this formula. 

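Maclaurin presented four formulas and he understood these to be variations of the
same result.¹¹ In modern notation, two of these can be given as

$$\begin{aligned}
\sum_{k=0}^{n-1} f(a+k) = {} & \int_a^{a+n} f(x)\,dx + \frac{1}{2}\bigl(f(a) - f(a+n)\bigr) + \frac{1}{12}\bigl(f'(a+n) - f'(a)\bigr) \\
& - \frac{1}{720}\bigl(f'''(a+n) - f'''(a)\bigr) + \frac{1}{30240}\bigl(f^{(v)}(a+n) - f^{(v)}(a)\bigr) - \cdots,
\end{aligned} \tag{20.5}$$

$$\begin{aligned}
\sum_{k=0}^{n-1} f(a+k) = {} & \int_{a-\frac{1}{2}}^{a-\frac{1}{2}+n} f(x)\,dx + \frac{1}{24}\Bigl(f'\bigl(a-\tfrac{1}{2}\bigr) - f'\bigl(a-\tfrac{1}{2}+n\bigr)\Bigr) \\
& - \frac{7}{5760}\Bigl(f'''\bigl(a-\tfrac{1}{2}\bigr) - f'''\bigl(a-\tfrac{1}{2}+n\bigr)\Bigr) \\
& + \frac{31}{967680}\Bigl(f^{(v)}\bigl(a-\tfrac{1}{2}\bigr) - f^{(v)}\bigl(a-\tfrac{1}{2}+n\bigr)\Bigr) - \cdots.
\end{aligned} \tag{20.6}$$

Note that the coefficients appearing after the second term on the right-hand side of
(20.5) are given by

$$\frac{B_2}{2!},\quad \frac{B_4}{4!},\quad \frac{B_6}{6!},\ \ldots,$$

and the coefficients appearing after the first term on the right-hand side of (20.6) are
given by

$$\frac{(2-1)B_2}{2\cdot 2!},\quad \frac{(2^3-1)B_4}{2^3\cdot 4!},\quad \frac{(2^5-1)B_6}{2^5\cdot 6!},\ \ldots.$$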
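
Both of Maclaurin's finite formulas are exact for polynomials of low degree, which gives a convenient check of the numerical coefficients. The sketch below is an illustration under stated assumptions: it takes the endpoint form with coefficients $1/12$, $-1/720$, $1/30240$ multiplying $f^{(j)}(a+n) - f^{(j)}(a)$, and the midpoint form with coefficients $1/24$, $-7/5760$, $31/967680$ multiplying $f^{(j)}(a-\tfrac12) - f^{(j)}(a-\tfrac12+n)$, and tests them on the hypothetical example $f(x) = x^6$ in exact rational arithmetic.

```python
from fractions import Fraction
from math import factorial

P = 6  # test function f(x) = x**P; its seventh derivative vanishes


def deriv(j, x):
    """j-th derivative of x**P evaluated at x."""
    if j > P:
        return Fraction(0)
    return Fraction(factorial(P), factorial(P - j)) * x ** (P - j)


def integral(lo, hi):
    """Integral of x**P from lo to hi."""
    return (hi ** (P + 1) - lo ** (P + 1)) / (P + 1)


def direct_sum(a, n):
    a = Fraction(a)
    return sum((a + k) ** P for k in range(n))


def endpoint_form(a, n):
    """Integral over (a, a+n) corrected by derivatives at the endpoints."""
    a = Fraction(a)
    b = a + n
    return (integral(a, b) + Fraction(1, 2) * (a ** P - b ** P)
            + Fraction(1, 12) * (deriv(1, b) - deriv(1, a))
            - Fraction(1, 720) * (deriv(3, b) - deriv(3, a))
            + Fraction(1, 30240) * (deriv(5, b) - deriv(5, a)))


def midpoint_form(a, n):
    """Integral over (a - 1/2, a - 1/2 + n) corrected by endpoint derivatives."""
    lo = Fraction(a) - Fraction(1, 2)
    hi = lo + n
    return (integral(lo, hi)
            + Fraction(1, 24) * (deriv(1, lo) - deriv(1, hi))
            - Fraction(7, 5760) * (deriv(3, lo) - deriv(3, hi))
            + Fraction(31, 967680) * (deriv(5, lo) - deriv(5, hi)))


if __name__ == "__main__":
    print(direct_sum(2, 5), endpoint_form(2, 5), midpoint_form(2, 5))
```

Because the arithmetic is exact, equality of the three printed values confirms the coefficients rather than merely suggesting them.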


9 Maclaurin (1742) p. 691, footnote. 
10 Tweddle (1988) p. 146. 
11 For the original statement of these formulas, see Maclaurin (1742) articles 352–353, pp. 292–293. For the
proofs, see articles 828–832, pp. 672–677.
The remaining two formulas were for cases in which the series on the left-hand side
were infinite, and where it was assumed that $f(x)$ and its derivatives tended to zero as
$x \to \infty$. Maclaurin derived the de Moivre and Stirling forms of the approximations
for $n!$ by taking $f(x) = \ln x$ in (20.5) and (20.6), respectively.¹² He also applied his
results to obtain Jakob Bernoulli’s formula for sums of powers of integers as well as
approximations of $\zeta(n)$ for some values of $n$. In addition, he obtained some formulas
for approximate integration, such as the three-eighths rule.

In 1772,¹³ Lagrange gave a formal expression for the Taylor series as a basis for
an interesting derivation of the Euler–Maclaurin summation formula and of some
extensions involving sums of sums. Suggestive of important analytical applications,
Lagrange’s formula for the Taylor series was

$$f(x+h) = f(x) + hf'(x) + \frac{h^2}{2!}f''(x) + \cdots = \Bigl(1 + hD + \frac{h^2D^2}{2!} + \cdots\Bigr)f(x) = e^{hD}f(x). \tag{20.7}$$

Here $D$ represents the differential operator $\frac{d}{dx}$. We discuss this in detail in Chapter 21.

Clearly, Lagrange charted out a new approach with his algebraic conception of the 
derivative, and yet this algebraic perspective can be traced back to the work of Leibniz. 
Leibniz had been struck by the formal analogy between the differential operator and 
algebraic quantities; Lagrange implemented this idea by going a step further and 
identifying the derivative operator with an algebraic quantity. This formal method was 
used by some French and British mathematicians of the first half of the nineteenth 
century, leading to significant mathematical developments. 

The eighteenth-century mathematicians made very effective use of the Euler— 
Maclaurin formula but did not seem too concerned about the reasons for this 
effectiveness, especially where divergent asymptotic series were involved. Gauss, 
with his interest in rigor, was the first mathematician to express the need for an 
investigation into this question. He did this in his 1813 paper on the gamma function 
and hypergeometric series¹⁴ and again in 1816 in a paper on the fundamental
theorem of algebra. Interestingly, the rigor needed for the careful discussion of 
the Euler—Maclaurin summation formula was provided by the French mathematical 
physicist, S. D. Poisson. 

Siméon Denis Poisson (1781–1840) studied at the École Polytechnique in Paris
where he came under the influence of Laplace and Lagrange. The latter lectured 
on analytic functions at the Polytechnique, where he introduced the remainder term 
for the Taylor series. In the 1820s, Cauchy lectured at the Polytechnique on the 
application of this remainder term to a rigorous discussion of the power series 
representation of functions. Poisson’s contribution was to derive the remainder term 
for the Euler–Maclaurin series.¹⁵ His motivation for this 1826 work was to explain an


12 ibid. articles 838-854, pp. 678-692. 

13 Lagrange (1867–1892) vol. 3, pp. 441–476.
14 Gauss (1813). 

15 Poisson (1826). 
apparent paradox in Legendre’s use of the Euler–Maclaurin formula to numerically
evaluate the elliptic integral

$$\int_0^{\frac{\pi}{2}} \sqrt{1 - k^2\sin^2\theta}\;d\theta. \tag{20.8}$$

The sum on the left-hand side of (20.5), after a small modification to allow for
non-integer division points of the interval $(a, a+n)$, can be used to approximate the
integral on the right-hand side. Now the integrand in (20.8), $f(x) = \sqrt{1 - k^2\sin^2 x}$, is
such that its odd order derivatives vanish at $0$ and $\frac{\pi}{2}$. Thus, the series on the right-hand
side of (20.5) vanishes, since it involves only the odd order derivatives. This implies 
the absurd result that the sum on the left-hand side remains unchanged no matter how 
many division points are chosen in the interval. It was to resolve this paradox, rather 
than to explain the effectiveness of the asymptotic series, that Poisson developed the 
remainder term for the Euler—Maclaurin formula. 
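
The paradox is easy to observe numerically. In the sketch below, which is an illustration and not Legendre's or Poisson's computation, the trapezoidal sum for (20.8) is formed with different numbers of division points; the modulus $k = 0.9$ and the function names are assumptions made for the example.

```python
import math


def integrand(x, k):
    return math.sqrt(1.0 - (k * math.sin(x)) ** 2)


def trapezoid(n, k):
    """Trapezoidal sum for (20.8) using n equal subintervals of (0, pi/2)."""
    h = (math.pi / 2) / n
    inner = math.fsum(integrand(j * h, k) for j in range(1, n))
    ends = 0.5 * (integrand(0.0, k) + integrand(math.pi / 2, k))
    return h * (ends + inner)


if __name__ == "__main__":
    k = 0.9
    for n in (1, 2, 4, 8, 16, 32, 64):
        print(n, trapezoid(n, k))
```

Every correction term built from odd order derivatives at the endpoints is zero, yet the sums for $n = 2$ and $n = 4$ differ in the third decimal place. The sums do settle down very quickly as $n$ grows, which is the behaviour a remainder term has to account for.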

Another peculiar feature of Poisson’s work was that he used Fourier series instead 
of Taylor series to find the remainder term. Poisson learned the technique of Fourier 
series from Fourier’s long 1807 paper on heat conduction, on which Poisson wrote a 
brief summary in 1807.¹⁶ From 1811 on, Poisson published several papers on Fourier
series and was very familiar with its techniques. In particular, he applied the result 
now known as the Poisson summation formula (originally due to Cauchy) to a number 
of problems, including the present one. He obtained the remainder after $q$ terms as an
integral whose integrand had the form

$$\Bigl(\sum_{n=1}^{\infty} \frac{1}{n^{2q}}\cos 2n\pi x\Bigr) f^{(2q)}(x). \tag{20.9}$$


In his 1826 paper, Poisson also applied the Euler—Maclaurin formula to the 
derivation of a result he attributed to Laplace. Laplace arrived at his result as he 
attempted to approximate an integral by a sum during his study of the variations 
of the elements of the orbit of a comet. It is a remarkable fact that James Gregory 
communicated just this result to Collins in a letter dated November 23, 1670.¹⁷

Jacobi, who was aware of Poisson’s paper, gave a different derivation of Euler–
Maclaurin using Taylor series;¹⁸ we state the formula he derived in a slightly more
convenient format:

$$\sum_{k=m}^{n} f(k) = \int_m^n f(x)\,dx + \frac{1}{2}\bigl(f(m) + f(n)\bigr) + \sum_{s=1}^{q} \frac{B_{2s}}{(2s)!}\bigl(f^{(2s-1)}(n) - f^{(2s-1)}(m)\bigr) + R_q(f), \tag{20.10}$$


16 Poisson (1807). 
17 Turnbull (1939) pp. 118–122.
18 Jacobi (1834). 
where $R_q(f)$ is the remainder term:

$$R_q(f) = -\frac{1}{(2q)!}\int_m^n B_{2q}(x - [x])\,f^{(2q)}(x)\,dx. \tag{20.11}$$

The $B_{2s}$ are the Bernoulli numbers; the Bernoulli polynomial $B_q(t)$ is defined by

$$B_q(t) = \sum_{k=0}^{q} \binom{q}{k} B_k\,t^{q-k}. \tag{20.12}$$

Note that since $B_{2k+1} = 0$ for $k \ge 1$, only odd order derivatives appear in the sum
(20.10). It can therefore be shown, applying integration by parts, that changing every
$2q$ to $2q+1$, while also changing the $-$ to $+$, does not effect a change in $R_q$.

Jacobi defined, but did not name, the even Bernoulli polynomials $B_{2m}(x)$ and
gave their generating function. In the 1840s, Raabe gave them the name Bernoulli
polynomials.¹⁹ Jacobi also proved the important result that $B_{4m+2}(x) - B_{4m+2}$ was
negative while $B_{4m}(x) - B_{4m}$ was positive in the interval $(0,1)$. From this he was
able to give sufficient conditions on $f$ that the remainder term had the same sign as,
and magnitude at most that of, the first omitted term in the series on the right-hand
side of (20.5). One set of sufficient conditions was that the sign of $f^{(2m)}(x)$ did not
change for $x > a$ and that the product $f^{(2m)}(x) f^{(2m+2)}(x)$ was positive. Since this
was clearly true for $f(x) = \ln x$, Jacobi’s result actually implied that de Moivre’s and
Stirling’s series were asymptotic. Thus, though Jacobi did not explicitly mention it, he
had resolved the problem raised by Gauss.
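
The sign pattern of $B_{2m}(x) - B_{2m}$ on $(0,1)$ can be examined directly from the definition (20.12). The sketch below assumes the modern convention $B_1 = -\tfrac12$ and generates the Bernoulli numbers from the standard recurrence $\sum_{k=0}^{m}\binom{m+1}{k}B_k = 0$; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def bernoulli_poly(q, t, B):
    """B_q(t) = sum_{k=0}^{q} C(q, k) B_k t^(q-k), as in (20.12)."""
    return sum(comb(q, k) * B[k] * t ** (q - k) for k in range(q + 1))


if __name__ == "__main__":
    B = bernoulli_numbers(9)
    for q in (2, 4, 6, 8):
        values = [bernoulli_poly(q, Fraction(j, 10), B) - B[q] for j in range(1, 10)]
        print(q, "all negative" if all(v < 0 for v in values) else
              "all positive" if all(v > 0 for v in values) else "mixed")
```

Sampling at nine interior points is of course no proof of the sign on the whole interval; it only displays the alternation between the cases $q = 4m+2$ and $q = 4m$.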

The papers of Poisson and Jacobi show that the Euler—Maclaurin formula follows 
from the Poisson summation formula. Thus, these two extremely important formulas 
are essentially equivalent. Moreover, by comparing the remainders in the formulas 
of Poisson and Jacobi, we observe that the Bernoulli polynomials $B_n(x)$ restricted to
0 < x < 1 have Fourier series expansions. Surprisingly, Euler and D. Bernoulli were 
aware of this fact and in the 1770s, Euler gave a very interesting derivation of this 
result by starting with a divergent series; for a full discussion of this, see Section 20.5. 

In 1823, the Norwegian mathematician Niels Abel (1802-1829) found another 
summation formula, called the Plana–Abel formula.²⁰ There was little mathematical
instruction at Abel’s alma mater, University of Christiania, so he independently 
studied the works of Euler, Lagrange, and Laplace. Thus, even before doing his great 
work on algebraic equations and elliptic and Abelian functions, Abel made some 
interesting discoveries as a student. For example, he found an integral representation 
for the Bernoulli numbers, and he substituted this in the Euler—Maclaurin formula 
to write: 


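$$\sum \varphi(x) = \int \varphi(x)\,dx - \frac{1}{2}\varphi(x) + \int_0^{\infty} \frac{\varphi\bigl(x + \frac{t}{2}\sqrt{-1}\bigr) - \varphi\bigl(x - \frac{t}{2}\sqrt{-1}\bigr)}{2\sqrt{-1}}\;\frac{dt}{e^{\pi t} - 1}.$$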


19 Raabe (1848). 
20 Abel (1965) vol. I, pp. 11-27, especially p. 23. 
Interestingly, the Italian astronomer and mathematician Giovanni Plana (1781–
1864) discovered this result²¹ three years before Abel. Plana studied with Lagrange at
the École Polytechnique; both Lagrange and Fourier supported Plana in the course of
his long and illustrious career.


20.2 Euler on the Euler—Maclaurin Formula 


Euler’s problem was to sum $\sum_{k=1}^{x} t(k) = S(x)$, where he assumed that $t(x)$ and
$S(x)$ were analytic functions for $x > 0$. Naturally, he did not state such conditions,
but his calculations imply them. His procedure for solving this problem was almost
reckless.²² He expanded $S(x-1)$ as a Taylor series:

$$S(x-1) = S(x) - S'(x) + \frac{1}{2!}S''(x) - \frac{1}{3!}S'''(x) + \cdots.$$

Following his notation except for the factorials, Euler then had

$$t(n) = S(n) - S(n-1) = \frac{1}{1!}\frac{dS}{dn} - \frac{1}{2!}\frac{d^2S}{dn^2} + \frac{1}{3!}\frac{d^3S}{dn^3} - \frac{1}{4!}\frac{d^4S}{dn^4} + \cdots. \tag{20.13}$$

To determine $S$ from this equation, he assumed

$$S = \int t\,dn + \alpha t + \beta\frac{dt}{dn} + \gamma\frac{d^2t}{dn^2} + \delta\frac{d^3t}{dn^3} + \cdots. \tag{20.14}$$


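Next, he substituted this series for $S$ on the right-hand side of (20.13) and equated
coefficients. He called this a well-known method, probably referring to the method of
undetermined coefficients. In his first paper on this topic, he merely noted the values
of $\alpha, \beta, \gamma, \delta, \ldots$ obtained when this substitution was carried out. In the second paper
he observed that he got

$$t = t + \alpha\frac{dt}{dn} + \beta\frac{d^2t}{dn^2} + \cdots - \frac{1}{2!}\Bigl(\frac{dt}{dn} + \alpha\frac{d^2t}{dn^2} + \cdots\Bigr) + \frac{1}{3!}\Bigl(\frac{d^2t}{dn^2} + \alpha\frac{d^3t}{dn^3} + \cdots\Bigr) - \cdots.$$

The term $t$ on both sides cancelled, and thus the coefficients of $\frac{dt}{dn}, \frac{d^2t}{dn^2}, \ldots$ had to
be zero. This gave him

$$\alpha - \frac{1}{2} = 0,\quad \beta - \frac{\alpha}{2} + \frac{1}{6} = 0,\quad \gamma - \frac{\beta}{2} + \frac{\alpha}{6} - \frac{1}{24} = 0,\ \ldots,$$

so that

$$\alpha = \frac{1}{2},\quad \beta = \frac{1}{12},\quad \gamma = 0,\quad \delta = -\frac{1}{720},\ \ldots.$$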
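
The method of undetermined coefficients described above amounts to a simple recurrence, which can be run in exact arithmetic. In the sketch below the recurrence $c_k = \sum_{j=1}^{k} (-1)^{j+1} c_{k-j}/(j+1)!$ with $c_0 = 1$ is obtained by setting the coefficient of the $k$-th derivative of $t$ to zero; the function name `euler_coefficients` is an illustrative choice.

```python
from fractions import Fraction
from math import factorial


def euler_coefficients(count):
    """Return [alpha, beta, gamma, delta, ...] as exact fractions."""
    coeffs = []
    for k in range(1, count + 1):
        prev = [Fraction(1)] + coeffs  # prev[i] is c_i, with c_0 = 1
        c_k = sum(Fraction((-1) ** (j + 1), factorial(j + 1)) * prev[k - j]
                  for j in range(1, k + 1))
        coeffs.append(c_k)
    return coeffs


if __name__ == "__main__":
    print(euler_coefficients(6))
```

The first six values are $\tfrac12, \tfrac1{12}, 0, -\tfrac1{720}, 0, \tfrac1{30240}$, the coefficients of the derivatives of $t$ of orders $0$ through $5$.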


21 Plana (1820). 
22 Eu. I-14 pp. 108–123. E 47.
Finally, Euler could write the Euler–Maclaurin formula as

$$S = \int t\,dn + \frac{1}{2}t + \frac{1}{12}\frac{dt}{dn} - \frac{1}{720}\frac{d^3t}{dn^3} + \frac{1}{30240}\frac{d^5t}{dn^5} - \cdots.$$

In fact, he calculated the terms up to the fifteenth derivative.
In later papers and in his 1755 book on the differential calculus, Euler stated this
formula in terms of Bernoulli numbers:²³

$$Sz = \int z\,dx + \frac{1}{2}z + \frac{B_2}{2!}\frac{dz}{dx} + \frac{B_4}{4!}\frac{d^3z}{dx^3} + \frac{B_6}{6!}\frac{d^5z}{dx^5} + \cdots. \tag{20.15}$$

Here $z$ was a function of $x$ and $Sz$ represented the sum of a series whose last term
was $z$. For example, for the case $z = \frac{1}{x^n}$:

$$Sz = 1 + \frac{1}{2^n} + \frac{1}{3^n} + \cdots + \frac{1}{x^n}, \tag{20.16}$$

or when $z = \ln x$:

$$Sz = \ln 1 + \ln 2 + \cdots + \ln x = \ln x!. \tag{20.17}$$


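Recall from Chapter 18 that de Moivre gave an asymptotic formula for $\ln x!$; in his
differential calculus,²⁴ Euler derived this formula from (20.15) by taking $z = \ln x$,
and $C$ as the constant of integration, obtaining

$$\ln x! = C + x\ln x - x + \frac{1}{2}\ln x + \frac{B_2}{1\cdot 2}\frac{1}{x} + \frac{B_4}{3\cdot 4}\frac{1}{x^3} + \frac{B_6}{5\cdot 6}\frac{1}{x^5} + \cdots. \tag{20.18}$$

He next set $x = 1$ to find

$$C = 1 - \frac{B_2}{1\cdot 2} - \frac{B_4}{3\cdot 4} - \frac{B_6}{5\cdot 6} - \frac{B_8}{7\cdot 8} - \cdots, \tag{20.19}$$

a series he knew to be divergent. In fact, he remarked that this excessively divergent
series was unfit to be used to find an approximate value of $C$. So Euler employed
Wallis’s formula, as had de Moivre, to calculate $C$. Recall Wallis’s formula

$$\frac{\pi}{2} = \frac{2\cdot 2\cdot 4\cdot 4\cdot 6\cdot 6\cdot 8\cdot 8\cdot \text{etc.}}{1\cdot 3\cdot 3\cdot 5\cdot 5\cdot 7\cdot 7\cdot 9\cdot \text{etc.}};$$

to apply this, Euler started by taking $x = \infty$ in (20.18) to obtain

$$\sum_{k=1}^{x} \ln k = C + \Bigl(x + \frac{1}{2}\Bigr)\ln x - x, \tag{20.20}$$

$$\sum_{k=1}^{2x} \ln k = C + \Bigl(2x + \frac{1}{2}\Bigr)\ln 2x - 2x. \tag{20.21}$$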


23 Eu. I-10 § 140. E 212.
24 ibid. § 157—159 




He then added $\ln 2$ to each term on the left-hand side of (20.20) and $x\ln 2$ to the
right-hand side to write

$$\sum_{k=1}^{x} \ln(2k) = C + \left(x + \frac{1}{2}\right)\ln x + x\ln 2 - x. \tag{20.22}$$

Subtracting (20.22) from (20.21), he arrived at

$$\sum_{k=1}^{x} \ln(2k-1) = x\ln x + \left(x + \frac{1}{2}\right)\ln 2 - x. \tag{20.23}$$


Still taking $x$ to be “infinitely” large, Euler took the logarithm of each side of
Wallis’s formula and then applied (20.22) and (20.23) to find that

$$\begin{aligned}
\ln\frac{\pi}{2} &= 2\sum_{k=1}^{x}\ln(2k) - \ln(2x) - 2\sum_{k=1}^{x}\ln(2k-1)\\
&= 2C + (2x+1)\ln x + 2x\ln 2 - 2x - \ln 2 - \ln x - 2x\ln x - (2x+1)\ln 2 + 2x\\
&= 2C - 2\ln 2
\end{aligned}$$

or

$$C = \frac{1}{2}\ln 2\pi.$$

Euler then gave the decimal value of $\frac{1}{2}\ln(2\pi)$ as

0.9189385332046727417803297,

and wrote that the value of the divergent series (20.19) was $\frac{1}{2}\ln(2\pi)$. Recall that Euler
was of the view that divergent series had a sum; he believed that, depending on the
type of series, certain divergent series could be used to find the approximate values of
the functions being expanded and some could not be so used.
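As a numerical aside, the limiting argument behind (20.20) can be checked directly: for large $x$, the quantity $\ln x! - (x + \frac{1}{2})\ln x + x$ should approach $\frac{1}{2}\ln(2\pi)$. The following is only an illustrative sketch, not Euler’s procedure; the function name `stirling_constant_estimate` is ours, and `math.lgamma(x + 1)` is used as a stand-in for $\ln x!$.

```python
import math

def stirling_constant_estimate(x):
    """Return ln x! - (x + 1/2) ln x + x, which should approach C as x grows."""
    log_factorial = math.lgamma(x + 1)  # ln x! without forming x! itself
    return log_factorial - (x + 0.5) * math.log(x) + x

half_log_two_pi = 0.5 * math.log(2 * math.pi)

for x in (10, 100, 1000, 10000):
    estimate = stirling_constant_estimate(x)
    # The gap shrinks roughly like B_2 / (1*2*x) = 1/(12x), the first term of (20.18).
    print(x, estimate, estimate - half_log_two_pi)
```

The printed gaps decrease roughly in proportion to $1/x$, in line with the first Bernoulli term of (20.18).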

In Chapter 18, we saw that de Moivre conjectured the asymptotic formula for 
c>l 


[o,@) CO 
1 1 1 B 1)--- 2k —2 
oe ee a Oe 2028) 
k=0 aise k=0 ss 
he also conjectured that 
! | ! | ! | | ! 
no need A ea 
a 1 Bo Bg 1 Bo Ba 
vn On Ine" 4nA 2a 2a2 4at ( ) 


Now de Moivre may have obtained (20.25) by taking the term-by-term derivative 
of the asymptotic series for In x! and then obtained (20.24) by further differentiation. 
Although de Moivre did not give proofs for these results, Stirling specifically noted
the case $c = 2$:²⁵

$$\sum_{k=0}^{\infty}\frac{1}{(n+k)^2} \sim \frac{1}{n} + \frac{1}{2n^2} + \sum_{k=1}^{\infty}\frac{B_{2k}}{n^{2k+1}}. \tag{20.26}$$
k=0 k=1 


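Euler gave a proof of this case,²⁶ taking $z = \frac{1}{x^2}$ in (20.15) to obtain

$$1 + \frac{1}{2^2} + \cdots + \frac{1}{x^2} = C - \frac{1}{x} + \frac{1}{2x^2} - \frac{B_2}{x^3} - \frac{B_4}{x^5} - \frac{B_6}{x^7} - \frac{B_8}{x^9} - \frac{B_{10}}{x^{11}} - \cdots. \tag{20.27}$$

Note that $C = \sum_{k=1}^{\infty}\frac{1}{k^2}$, so that (20.27) is equivalent to (20.26). To find an
approximate value for $C = \frac{\pi^2}{6}$, Euler set $x = 10$ in (20.27); he calculated

$$\sum_{k=1}^{10}\frac{1}{k^2} = 1.549767731166540690.$$

He then took ten terms of the series on the right-hand side of (20.27) and obtained

$$C = \frac{\pi^2}{6} = 1.644934066848226430.$$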


Note that although (20.27) was a divergent series, it yielded good approximations 
to the value of C because of its asymptotic character. Although Euler had not fully 
fathomed the nature of such a series, to be revealed later by the work of Poisson and 
Jacobi, his mathematical experience and intuition allowed him to utilize it effectively 
for correct results. 
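Euler’s computation can be retraced with exact rational arithmetic. The sketch below solves (20.27) for $C$ at $x = 10$ and truncates the Bernoulli series; the helper names are ours, and reading “ten terms” as the terms through $B_{20}$ is an assumption made only for this illustration.

```python
from fractions import Fraction
from math import comb, pi

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        # Recurrence: sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1.
        partial = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-partial / (m + 1))
    return B

def euler_estimate_of_C(x=10, bernoulli_terms=10):
    """Solve (20.27) for C, truncating after the term in B_{2*bernoulli_terms}."""
    B = bernoulli_numbers(2 * bernoulli_terms)
    partial_sum = sum(Fraction(1, k * k) for k in range(1, x + 1))
    # C = partial_sum + 1/x - 1/(2x^2) + B_2/x^3 + B_4/x^5 + ...
    correction = Fraction(1, x) - Fraction(1, 2 * x * x)
    correction += sum(B[2 * k] / Fraction(x) ** (2 * k + 1)
                      for k in range(1, bernoulli_terms + 1))
    return partial_sum, partial_sum + correction

partial_sum, C = euler_estimate_of_C()
print(float(partial_sum))
print(float(C), pi ** 2 / 6)
```

Although the series diverges, its terms at $x = 10$ are still shrinking when it is cut off here, which is the asymptotic behavior the paragraph above describes.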


20.3 Maclaurin’s Derivation of the Euler—Maclaurin Formula 


Maclaurin’s proof of the Euler-Maclaurin formula is similar to that of Euler, though 
the procedure appears to be more rigorous. Maclaurin described his results in geo- 
metric terms, but his arguments were mostly analytic. However, to enter Maclaurin’s 
geometric mode of thought, we start with his proof of the integral test, usually 
attributed to Cauchy, who proved it in his lectures of 1828. The Euler—Maclaurin 
formula may be viewed as a refinement of the integral test. 


25 Tweddle (1988) pp. 15-16. 
26 ibid. § 148-49. 
Figure 20.1 Maclaurin’s geometric statement of his formula.


Referring to Figure 20.1, in section 350 of his treatise, Maclaurin wrote:²⁷


Let the terms of any progression be represented by the perpendiculars AF, BE, CK, HL, &c.
that stand upon the base AD at equal distances; and let PN be any ordinate of the curve FNe
that passes through the extremities of those perpendiculars. Suppose AP to be produced; and
according as the area APNF has a limit which it never amounts to, or may be produced till
it exceed any given space, there is a limit which the sum of the progression never amounts
to, or it may be continued till its sum exceed any given number. For let the rectangles
FB, EC, KH, LI, &c. be completed, and, the area APNF being continued over the same base,
it is always less than the sum of all those rectangles, but greater than the sum of all the rectangles
after the first. Therefore the area APNF and the sum of those rectangles either both have limits,
or both have none; and it is obvious, that the same is to be said of the sum of the ordinates
AF, BE, CK, HL, &c. and of the sum of the terms of the progression that are represented
by them.
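Maclaurin’s comparison of the area with the two sums of rectangles can be illustrated numerically for a decreasing positive function. The sketch below is only an illustration under our own assumptions: the base is taken to be $[1, n]$ with unit spacing, the ordinate is $1/x^2$, and the function name `maclaurin_bounds` is hypothetical.

```python
def maclaurin_bounds(f, antiderivative, n):
    """Return (rectangles after the first, area, all rectangles) over the base [1, n].

    The rectangle standing on [k, k+1] has height f(k), for k = 1, ..., n-1.
    """
    all_rectangles = sum(f(k) for k in range(1, n))
    after_first = all_rectangles - f(1)
    area = antiderivative(n) - antiderivative(1)
    return after_first, area, all_rectangles

# Ordinate 1/x^2, with antiderivative -1/x.
after_first, area, all_rectangles = maclaurin_bounds(
    lambda x: 1.0 / (x * x), lambda x: -1.0 / x, 50)
print(after_first, area, all_rectangles)
```

For this ordinate the area stays below 1 however far the base is produced, so both sums of rectangles are bounded as well, as the quoted passage asserts.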


Maclaurin’s derivation of the Euler-Maclaurin formula followed a slightly less
dangerous path than Euler’s. His initial description was geometric, but once he had
defined his terms with the help of a picture, his argument was analytic. The Maclaurin
series for $f(x) = \int_0^x y(t)\,dt$ was

$$f(x) = x\,y(0) + \frac{x^2}{2!}y'(0) + \frac{x^3}{3!}y''(0) + \cdots.$$

Thus,

$$\int_0^1 y\,dx = y(0) + \frac{y'(0)}{2!} + \frac{y''(0)}{3!} + \frac{y'''(0)}{4!} + \cdots$$

or

$$y(0) = \int_0^1 y\,dx - \frac{y'(0)}{2!} - \frac{y''(0)}{3!} - \frac{y'''(0)}{4!} - \cdots. \tag{20.28}$$


27 Maclaurin (1742) pp. 289-290.
Figure 20.2 Maclaurin’s representation of E-M formula.

Similarly

$$\begin{aligned}
y'(0) &= \int_0^1 y'\,dx - \frac{1}{2!}y''(0) - \frac{1}{3!}y'''(0) - \frac{1}{4!}y^{(iv)}(0) - \cdots,\\
y''(0) &= \int_0^1 y''\,dx - \frac{1}{2!}y'''(0) - \frac{1}{3!}y^{(iv)}(0) - \cdots,\\
y'''(0) &= \int_0^1 y'''\,dx - \frac{1}{2!}y^{(iv)}(0) - \cdots.
\end{aligned}$$

Maclaurin used these equations to eliminate $y'(0), y''(0), \ldots$ in (20.28), obtaining

$$y(0) = \int_0^1 y\,dx - \frac{1}{2}\int_0^1 y'\,dx + \frac{1}{12}\int_0^1 y''\,dx - \frac{1}{720}\int_0^1 y^{(iv)}\,dx + \cdots. \tag{20.29}$$


Thus, he obtained another form of the Euler-Maclaurin formula:

$$y(0) + y(1) + \cdots + y(n-1) = \int_0^n y\,dx - \frac{1}{2}\int_0^n y'\,dx + \frac{1}{12}\int_0^n y''\,dx - \frac{1}{720}\int_0^n y^{(iv)}\,dx + \cdots. \tag{20.30}$$


Maclaurin also explained how the coefficients were obtained. The reader should 
now have little trouble in following Maclaurin’s proof of (20.29) from section 828 of 
his book, while referring to Figure 20.2. 


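Suppose the base $AP = z$, the ordinate $PM = y$, and, the base being supposed to flow uniformly,
let $\dot{z} = 1$. Let the first ordinate $AF$ be represented by $a$, $AB = 1$, and the area $ABEF = A$.
As $A$ is the area generated by the ordinate $y$, so let $B, C, D, E, F$, &c. represent the areas upon
the same base $AB$ generated by the respective ordinates $\dot{y}, \ddot{y}, \dddot{y}, \ddddot{y}$, &c. Then $AF = a =
A - \frac{B}{2} + \frac{C}{12} - \frac{E}{720} + \frac{G}{30240} -$ &c. For, by art. 752, $A = a + \frac{\dot{a}}{2} + \frac{\ddot{a}}{6} + \frac{\dddot{a}}{24} + \frac{\ddddot{a}}{120} +$ &c., whence we have
the equation (Q) $a = A - \frac{\dot{a}}{2} - \frac{\ddot{a}}{6} - \frac{\dddot{a}}{24} - \frac{\ddddot{a}}{120} -$ &c. In like manner, $\dot{a} = B - \frac{\ddot{a}}{2} - \frac{\dddot{a}}{6} - \frac{\ddddot{a}}{24} -$ &c., $\ddot{a} =
C - \frac{\dddot{a}}{2} - \frac{\ddddot{a}}{6} -$ &c., $\dddot{a} = D - \frac{\ddddot{a}}{2} -$ &c., $\ddddot{a} = E -$ &c. by which latter equations, if we exterminate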


28 Maclaurin (1742) pp. 672-673. 
$\dot{a}, \ddot{a}, \dddot{a}, \ddddot{a}$, &c. from the value of $a$ in the equation Q, we find that $a = A - \frac{B}{2} + \frac{C}{12} - \frac{E}{720} +$ &c.


The coefficients are continued thus: let $k, l, m, n$, &c. denote the respective coefficients of
$\dot{a}, \ddot{a}, \dddot{a}$, &c. in the equation Q; that is, let $k = \frac{1}{2}$, $l = \frac{1}{6}$, $m = \frac{1}{24}$, &c.; suppose $K = k = \frac{1}{2}$,
$L = kK - l = \frac{1}{12}$, $M = kL - lK + m = 0$, $N = kM - lL + mK - n = -\frac{1}{720}$, and so
on; then $a = A - KB + LC - MD + NE -$ &c. where the coefficients of the alternate areas
$D, F, H$, &c. vanish.
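Maclaurin’s rule for continuing the coefficients $K, L, M, N, \ldots$ is a recursion that can be carried out mechanically. The sketch below follows the pattern of the quoted passage, extended in the evident way beyond $N$ (that extension is our reading, not Maclaurin’s words); the function name `maclaurin_coefficients` is hypothetical.

```python
from fractions import Fraction
from math import factorial

def maclaurin_coefficients(count):
    """Return [K, L, M, N, ...] built from k = 1/2!, l = 1/3!, m = 1/4!, ..."""
    small = [Fraction(1, factorial(j + 1)) for j in range(1, count + 1)]  # k, l, m, n, ...
    big = []
    for n in range(1, count + 1):
        # K = k, L = kK - l, M = kL - lK + m, N = kM - lL + mK - n, and so on.
        value = sum((-1) ** (i + 1) * small[i - 1] * big[n - i - 1] for i in range(1, n))
        value += (-1) ** (n + 1) * small[n - 1]
        big.append(value)
    return big

print(maclaurin_coefficients(6))
```

In $a = A - KB + LC - MD + NE - \cdots$ the coefficient of $E$ is $N = -\frac{1}{720}$ and the sixth value, $\frac{1}{30240}$, multiplies $G$; these are the numbers $B_4/4!$ and $B_6/6!$ appearing in Euler’s form (20.15).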


20.4 Poisson’s Remainder Term 


Poisson entitled his 1826 paper on the Euler-Maclaurin formula “Sur le calcul
numérique des intégrales définies.”²⁹ This paper applied Fourier series, and the
Poisson summation formula in particular, to a variety of problems. Poisson began with
a brief sketch of the proof, given in an earlier paper, that the Abel means of
a Fourier series of a given function converge to the function at a point of continuity.
Note that the definition of an Abel mean is given in Chapter 32. Since the Abel mean
of a convergent series converges to the same value as the series, Poisson mistakenly
assumed that his proof sufficed to show that the Fourier series of a function converged
to the function. Poisson used this result on Fourier series to prove the Euler-Maclaurin
formula, which left a gap in his proof. However, the work of Fejér some seventy years later
filled in the gap and rescued Poisson’s proof and its expression for the remainder term.
Fejér’s proof is discussed in Section 31.2.

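Poisson started his proof of the Euler-Maclaurin formula by partitioning the
interval $[-a, a]$ into $2n$ equal parts with $a = n\omega$. The partition points were given,
then, by

$$-n\omega < -n\omega + \omega < \cdots < 0 < \omega < \cdots < n\omega.$$

He applied the trapezoidal rule, given in our Section 9.7, on each subinterval to a
function $f$, a function he implicitly assumed to be differentiable $2m$ times. Thus

$$\int_{-n\omega+k\omega}^{-n\omega+(k+1)\omega} f(t)\,dt \approx \frac{\omega}{2}\bigl(f(-n\omega+k\omega) + f(-n\omega+(k+1)\omega)\bigr),$$

$$\int_{-a}^{a} f(t)\,dt = \sum_{k=0}^{2n-1}\int_{-n\omega+k\omega}^{-n\omega+(k+1)\omega} f(t)\,dt \approx \omega P_n,$$

where

$$P_n = \frac{1}{2}f(-n\omega) + f(-n\omega+\omega) + f(-n\omega+2\omega) + \cdots + f(n\omega-2\omega) + f(n\omega-\omega) + \frac{1}{2}f(n\omega). \tag{20.31}$$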


29 Poisson (1826). 
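He then gave the equation

$$\int_{-a}^{a} f(t)\,dt = \omega P_n + Q_n, \tag{20.32}$$

where $Q_n$ denoted the error in the approximation $\omega P_n$ for the integral. To find an
expression for $Q_n$, though he did not refer to Fourier, he employed Fourier’s formula
(19.53) rescaled to the interval $(-a, a)$:

$$f(x) = \frac{1}{2a}\int_{-a}^{a} f(t)\,dt + \frac{1}{a}\sum_{k=1}^{\infty}\int_{-a}^{a} f(t)\cos\frac{k\pi(x-t)}{a}\,dt \tag{20.33}$$

when $-a < x < a$. When $x = \pm a$, the left-hand side was replaced by
$\frac{1}{2}(f(a) + f(-a))$. He took $x = n\omega, (n-1)\omega, \ldots, 0, -\omega, -2\omega, \ldots, -(n-1)\omega$
successively in (20.33) and added to get

$$P_n = \frac{n}{a}\int_{-a}^{a} f(t)\,dt + \frac{1}{a}\int_{-a}^{a} f(t)\sum_{j=-(n-1)}^{n}\sum_{k=1}^{\infty}\cos\frac{k\pi(j\omega - t)}{a}\,dt. \tag{20.34}$$

It is easy to check that the inner sum in (20.34), after changing the order of
summation, would be

$$\sum_{j=-(n-1)}^{n}\cos\frac{k\pi(j\omega - t)}{n\omega} = \begin{cases} 2n\cos\dfrac{2l\pi t}{\omega} & \text{when } k = 2nl,\\[4pt] 0 & \text{otherwise.}\end{cases}$$

So he obtained, with $a = n\omega$,

$$\omega P_n = \int_{-a}^{a} f(t)\,dt + 2\sum_{l=1}^{\infty}\int_{-a}^{a} f(t)\cos\frac{2l\pi t}{\omega}\,dt = \int_{-a}^{a} f(t)\,dt - Q_n. \tag{20.35}$$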


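To find another expression for $Q_n$, Poisson applied integration by parts repeatedly
to get

$$\begin{aligned}
\int_{-a}^{a}\cos\frac{2l\pi t}{\omega}\,f(t)\,dt &= \left[\frac{\omega}{2l\pi}\sin\frac{2l\pi t}{\omega}\,f(t)\right]_{-a}^{a} - \int_{-a}^{a}\frac{\omega}{2l\pi}\sin\frac{2l\pi t}{\omega}\,f'(t)\,dt\\
&= \left[\frac{\omega^2}{4\pi^2l^2}\cos\frac{2l\pi t}{\omega}\,f'(t)\right]_{-a}^{a} - \int_{-a}^{a}\frac{\omega^2}{4\pi^2l^2}\cos\frac{2l\pi t}{\omega}\,f''(t)\,dt\\
&= \frac{\omega^2}{4\pi^2l^2}\bigl(f'(a) - f'(-a)\bigr) - \frac{\omega^2}{4\pi^2l^2}\int_{-a}^{a}\cos\frac{2l\pi t}{\omega}\,f''(t)\,dt.
\end{aligned}$$

Thus, after $m$ repetitions of this process

$$\begin{aligned}
Q_n = {}&-\frac{2\omega^2}{(2\pi)^2}\sum_{l=1}^{\infty}\frac{1}{l^2}\bigl(f'(a) - f'(-a)\bigr) + \frac{2\omega^4}{(2\pi)^4}\sum_{l=1}^{\infty}\frac{1}{l^4}\bigl(f'''(a) - f'''(-a)\bigr)\\
&- \frac{2\omega^6}{(2\pi)^6}\sum_{l=1}^{\infty}\frac{1}{l^6}\bigl(f^{(5)}(a) - f^{(5)}(-a)\bigr) + \cdots\\
&+ (-1)^m\frac{2\omega^{2m}}{(2\pi)^{2m}}\sum_{l=1}^{\infty}\frac{1}{l^{2m}}\bigl(f^{(2m-1)}(a) - f^{(2m-1)}(-a)\bigr) + R_m
\end{aligned} \tag{20.36}$$

where

$$R_m = -2(-1)^m\left(\frac{\omega}{2\pi}\right)^{2m}\int_{-a}^{a}\sum_{l=1}^{\infty}\frac{1}{l^{2m}}\cos\frac{2l\pi t}{\omega}\,f^{(2m)}(t)\,dt. \tag{20.37}$$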

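Now to determine $\sum_{l=1}^{\infty}\frac{1}{l^{2m}}$, Poisson took $f(x) = e^x$ to obtain

$$\int_{-a}^{a} f(t)\,dt = e^a - e^{-a},$$

$$f^{(2j-1)}(a) - f^{(2j-1)}(-a) = e^a - e^{-a}.$$

He then let $n = 1$, so that setting $\omega = a$ in (20.35) and (20.31) gave him

$$P_1 = \frac{1}{2}e^{-a} + 1 + \frac{1}{2}e^{a}.$$

Substituting these values in (20.35) and (20.36) and dividing by $e^a - e^{-a}$, Poisson
obtained

$$\frac{a}{2}\,\frac{e^{a/2} + e^{-a/2}}{e^{a/2} - e^{-a/2}} = 1 + \frac{2a^2}{(2\pi)^2}\sum_{l=1}^{\infty}\frac{1}{l^2} - \frac{2a^4}{(2\pi)^4}\sum_{l=1}^{\infty}\frac{1}{l^4} + \frac{2a^6}{(2\pi)^6}\sum_{l=1}^{\infty}\frac{1}{l^6} - \cdots. \tag{20.38}$$

Now note that the left-hand side of (20.38) can be rewritten as

$$\frac{a}{2}\,\frac{e^a + 1}{e^a - 1} = \frac{a}{2} + \frac{a}{e^a - 1};$$

though Poisson does not refer to it as such, by (2.36), this represents the generating
function for the even Bernoulli numbers:

$$\frac{a}{2} + \frac{a}{e^a - 1} = 1 + \frac{B_2}{2!}a^2 + \frac{B_4}{4!}a^4 + \cdots. \tag{20.39}$$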
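Equating the coefficients in the powers of $a$ in (20.38) and (20.39), Poisson
succeeded in rederiving Euler’s formula (2.35):

$$1 + \frac{1}{2^{2k}} + \frac{1}{3^{2k}} + \cdots = \frac{(-1)^{k-1}(2\pi)^{2k}B_{2k}}{2\,(2k)!}. \tag{20.40}$$

By substituting these values in (20.36), Poisson had

$$\begin{aligned}
Q_n = {}&-\frac{B_2}{2!}\bigl(f'(a) - f'(-a)\bigr)\omega^2 - \frac{B_4}{4!}\bigl(f'''(a) - f'''(-a)\bigr)\omega^4 - \cdots\\
&- \frac{B_{2m}}{(2m)!}\bigl(f^{(2m-1)}(a) - f^{(2m-1)}(-a)\bigr)\omega^{2m} + R_m.
\end{aligned} \tag{20.41}$$

Substituting the expression for $Q_n$ from (20.41) into (20.32), with $R_m$ given by
(20.37), Poisson obtained the Euler-Maclaurin formula:

$$\int_{-a}^{a} f(t)\,dt = \omega\sum_{k=-n+1}^{n-1} f(k\omega) + \frac{\omega}{2}\bigl(f(-a) + f(a)\bigr) - \sum_{j=1}^{m}\frac{B_{2j}}{(2j)!}\bigl(f^{(2j-1)}(a) - f^{(2j-1)}(-a)\bigr)\omega^{2j} + R_m.$$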

Poisson also observed that the term in the integrand of (20.37), given by

$$\sum_{l=1}^{\infty}\frac{1}{l^{2m}}\cos\frac{2l\pi t}{\omega},$$

is less in absolute value than

$$\sum_{l=1}^{\infty}\frac{1}{l^{2m}}.$$

Thus, if $f^{(2m)}(t)$ does not change sign in the interval $[-a, a]$, we may write the
absolute value of the integral (20.37) as:

$$\left|2\left(\frac{\omega}{2\pi}\right)^{2m}\int_{-a}^{a}\sum_{l=1}^{\infty}\frac{1}{l^{2m}}\cos\frac{2l\pi t}{\omega}\,f^{(2m)}(t)\,dt\right| < \frac{\omega^{2m}}{(2m)!}\,|B_{2m}|\,\bigl|f^{(2m-1)}(a) - f^{(2m-1)}(-a)\bigr|. \tag{20.42}$$

Thus, the right-hand side of (20.42) gives the absolute value of the last term before
$R_m$ in (20.36); under the given condition on the function, the error $R_m$ was less
in absolute value than the last term of the series $Q_n$, explaining why the first few
terms of the series for $\ln\Gamma(x)$, although it diverged, yielded a good approximation of
the function. Recall from Section 18.10 that Cauchy gave a simpler but less general
derivation of this result seventeen years after Poisson.
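Poisson’s bound can be tried on his own test function $f(x) = e^x$, whose derivatives never change sign. The sketch below computes the trapezoidal sum (20.31), subtracts the first $m$ Bernoulli terms of (20.41), and compares what is left with the last retained term. The choices $a = 1$, $n = 4$, $m = 2$ and all function names are assumptions made for illustration only.

```python
import math
from fractions import Fraction

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        partial = sum(math.comb(m + 1, j) * B[j] for j in range(m))
        B.append(-partial / (m + 1))
    return B

def poisson_remainder(a=1.0, n=4, m=2):
    """Return (R_m, last retained term) for f = exp on [-a, a] with step a/n."""
    omega = a / n
    # Trapezoidal sum P_n of (20.31).
    P = 0.5 * math.exp(-a) + 0.5 * math.exp(a)
    P += sum(math.exp(k * omega) for k in range(-n + 1, n))
    integral = math.exp(a) - math.exp(-a)
    # Every derivative of exp is exp, so each endpoint difference is the same.
    endpoint_diff = math.exp(a) - math.exp(-a)
    B = bernoulli_numbers(2 * m)
    terms = [-float(B[2 * j]) / math.factorial(2 * j) * endpoint_diff * omega ** (2 * j)
             for j in range(1, m + 1)]
    Q = integral - omega * P   # (20.32)
    R = Q - sum(terms)         # (20.41) solved for R_m
    return R, terms[-1]

R, last_term = poisson_remainder()
print(R, last_term)
```

With these values the remainder is several orders of magnitude smaller than the last retained term, consistent with (20.42).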
20.5 Jacobi’s Remainder Term


The first rigorous treatment of the Euler-Maclaurin summation formula was given
by Jacobi in a paper dated June 2, 1834.³⁰ He employed the Taylor series with
remainder, a topic treated rigorously by Cauchy in the 1820s in his lectures at the
École Polytechnique; in this connection, see our Section 11.9. In his paper, Jacobi
obtained the Euler-Maclaurin formula with remainder for positive integers $a$ and $x$ in
the form

$$\sum_{k=a}^{x} f(k) = \int_a^x f(t)\,dt + \frac{1}{2}\bigl(f(a) + f(x)\bigr) + \sum_{k=1}^{m}\frac{B_{2k}}{(2k)!}\bigl(f^{(2k-1)}(x) - f^{(2k-1)}(a)\bigr) + R_{2m+2} \tag{20.43}$$

where

$$R_{2m+2}(f) = \frac{-1}{(2m+2)!}\int_a^x \bigl(B_{2m+2}(t - [t]) - B_{2m+2}\bigr)\,f^{(2m+2)}(t)\,dt. \tag{20.44}$$

Note that $B_n(x)$ represents the $n$th Bernoulli polynomial, defined by the generating
function

$$\frac{te^{xt}}{e^t - 1} = \sum_{n=0}^{\infty} B_n(x)\frac{t^n}{n!},$$

and $B_n$ denotes the $n$th Bernoulli number, defined by the value of the $n$th Bernoulli
polynomial at $x = 0$, that is, $B_n = B_n(0)$. For a discussion of the generating function
of the Bernoulli polynomials, see Section 2.10. Jacobi used the generating function
for Bernoulli numbers to show that for $m \ge 1$

$$\frac{1}{(2m+1)!} - \frac{1}{2}\,\frac{1}{(2m)!} + \frac{B_2}{2!\,(2m-1)!} + \frac{B_4}{4!\,(2m-3)!} + \cdots + \frac{B_{2m}}{(2m)!\,1!} = 0, \tag{20.45}$$

$$\frac{1}{(2m+2)!} - \frac{1}{2}\,\frac{1}{(2m+1)!} + \frac{B_2}{2!\,(2m)!} + \frac{B_4}{4!\,(2m-2)!} + \cdots + \frac{B_{2m}}{(2m)!\,2!} = 0, \tag{20.46}$$

two relations that, for $m \ge 1$, are equivalent to:

$$B_{m+1}(1) - B_{m+1} = 0, \tag{20.47}$$

an equation we proved in Section 2.10.
Observe that for the case in which $m + 1$ is odd, so that $B_{m+1} = 0$, equation (20.47)
is equivalent to (20.45); otherwise (20.47) is equivalent to (20.46). Also note that equations
(20.45) and (20.46) are identical with those employed by Seki and Bernoulli to define
Bernoulli numbers.


30 Jacobi (1834).
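Relations (20.45) and (20.46) are two parities of a single identity, $\sum_{j=0}^{n-1} \frac{B_j}{j!\,(n-j)!} = 0$, with $n = 2m+1$ and $n = 2m+2$ respectively and $B_1 = -\frac{1}{2}$. A brief exact check is sketched below; the function names are ours and the code is not part of Jacobi’s argument.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        partial = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-partial / (m + 1))
    return B

def jacobi_relation(n, B):
    """Return sum_{j=0}^{n-1} B_j / (j! (n-j)!), the left side of (20.45)/(20.46)."""
    return sum(B[j] / (factorial(j) * factorial(n - j)) for j in range(n))

B = bernoulli_numbers(20)
for n in range(3, 21):
    print(n, jacobi_relation(n, B))
```

For $n = 3$ the sum is $\frac{1}{3!} - \frac{1}{2}\cdot\frac{1}{2!} + \frac{B_2}{2!\,1!}$, which is the case $m = 1$ of (20.45).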
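Jacobi proceeded to define the functions

$$\psi(x) = \int_a^x f(t)\,dt \quad\text{and}\quad \psi(x) - \psi(x-h) = \varphi(x), \quad h > 0. \tag{20.48}$$

Taking $x - a$ to be an integer multiple of $h$, he denoted

$$\sum_a^x \varphi(x) = \varphi(a+h) + \varphi(a+2h) + \cdots + \varphi(x) = \psi(x) - \psi(a) = \psi(x). \tag{20.49}$$

Now since $\varphi(x) = \psi(x) - \psi(x-h)$, Taylor’s theorem with integral remainder
gave

$$\varphi(x) = \psi'(x)h - \psi''(x)\frac{h^2}{2!} + \cdots + (-1)^{n-1}\psi^{(n)}(x)\frac{h^n}{n!} + (-1)^n\int_0^h \frac{(h-t)^n}{n!}\,\psi^{(n+1)}(x-t)\,dt. \tag{20.50}$$

Because $\psi'(x) = f(x)$, Jacobi had $\psi^{(n+1)}(x) = f^{(n)}(x)$; then taking

$$I_n = \int_0^h \frac{(h-t)^n}{n!}\,f^{(n)}(x-t)\,dt,$$

(20.50) could be rewritten as

$$\varphi(x) = f(x)h - f'(x)\frac{h^2}{2!} + f''(x)\frac{h^3}{3!} - \cdots + (-1)^{n-1}f^{(n-1)}(x)\frac{h^n}{n!} + (-1)^n I_n. \tag{20.51}$$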


The next steps Jacobi delineated were to replace $n$ by $n - 1$ in (20.51); take its
derivative and multiply that derivative by $\frac{1}{2}h$; then change $n$ to $n-2$, $n$ to $n-4$, $\ldots$, $n$
to $n-2m$; correspondingly take the second derivative and multiply it by $\frac{B_2h^2}{2!}$, take the
fourth derivative and multiply it by $\frac{B_4h^4}{4!}$, $\ldots$, and, finally, take the $(2m)$th derivative
and multiply it by $\frac{B_{2m}h^{2m}}{(2m)!}$. Though he did not write it out, note that the result of this
process is

$$\frac{1}{2}\varphi'(x)h = \frac{1}{2}f'(x)h^2 - \frac{1}{2}f''(x)\frac{h^3}{2!} + \cdots + (-1)^n f^{(n-1)}(x)\frac{h^n}{(n-1)!\,2} + (-1)^{n-1}\frac{h}{2}I'_{n-1}, \tag{20.52}$$

$$\frac{B_2}{2!}\varphi''(x)h^2 = \frac{B_2}{2!}f''(x)h^3 - \frac{B_2}{2!}f'''(x)\frac{h^4}{2!} + \cdots + (-1)^{n-1}f^{(n-1)}(x)\frac{h^n}{(n-2)!}\,\frac{B_2}{2!} + (-1)^n\frac{B_2h^2}{2!}I''_{n-2}. \tag{20.53}$$


Unlike Euler, Jacobi omitted many details in his explications, since by that time
many received and known results could be assumed. Expanding, then, Jacobi’s terse
argument, note that when (20.50), (20.51), (20.52), (20.53), and so on were added, the
sum would be

$$\varphi(x) + \frac{1}{2}\varphi'(x)h + \frac{B_2}{2!}\varphi''(x)h^2 + \frac{B_4}{4!}\varphi^{(iv)}(x)h^4 + \cdots + \frac{B_{2m}}{(2m)!}\varphi^{(2m)}(x)h^{2m} = f(x)h + \int_0^h T_m\,f^{(2m+2)}(x-t)\,dt \tag{20.54}$$

where, since $B_1 = -\frac{1}{2}$,

$$T_m = \frac{(h-t)^{2m+2}}{(2m+2)!} + \frac{B_1(h-t)^{2m+1}h}{(2m+1)!} + \frac{B_2(h-t)^{2m}h^2}{2!\,(2m)!} + \cdots + \frac{B_{2m}(h-t)^2h^{2m}}{(2m)!\,2!}. \tag{20.55}$$

Observe that the terms involving $f'(x)h^2, f''(x)h^3, \ldots$ vanish. For example, the
coefficient of $f''(x)h^3$ in the sum (20.54) is

$$\frac{1}{3!} + \frac{B_1}{2!} + \frac{B_2}{2!},$$

that becomes zero when we take $m = 1$ in (20.45). Next note that the coefficient of
$f'''(x)h^4$ is

$$-\left(\frac{1}{4!} + \frac{B_1}{3!} + \frac{B_2}{2!\,2!}\right),$$


a sum that is zero by (20.46). Thus, following this pattern, all the powers of h up to h2” 
reduce to zero, as Jacobi mentions, explaining (20.54) and (20.55). How did Jacobi 
perceive, after such a complicated calculation, that the terms in (20.54) would vanish? 
It would appear that the clue might have been that, in terms of symbolic calculus 
discussed in our Chapter 21, 


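$$\begin{aligned}
\varphi(x) &= \psi(x) - \psi(x-h)\\
&= \psi(x) - e^{-hD}\psi(x).
\end{aligned}$$

Thus

$$\varphi(x) = (1 - e^{-hD})\psi$$

or

$$\psi = \frac{e^{hD}}{e^{hD}-1}\,\varphi = \varphi + \frac{1}{e^{hD}-1}\,\varphi = \varphi + \sum_{n=0}^{\infty}\frac{B_n}{n!}\,h^{n-1}D^{n-1}\varphi$$

or

$$h\psi'(x) = \varphi(x) + \frac{1}{2}\varphi'(x)h + \frac{B_2}{2!}\varphi''(x)h^2 + \cdots. \tag{20.56}$$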


Jacobi surely saw that the sum on the right-hand side of (20.56), up to 2m terms, 
was identical with the sum on the left-hand side of (20.54) up to 2m terms, so 
that the terms would cancel, leaving hf. Jacobi thus stated the E-M formula with 
remainder: 


“/(8@) 1.,... " 
P| ; +58'@)4 9 (x)h +-- = i ) 
_ FIO) ! ” Bom (2m) 2m—1 
=I (4 Sf SP Ont ‘+ ol if (t)h ja 
x: h oe: 
= (Gs / fo FOOD) at, (20.57) 
a 0 a 


with $T_m$ defined by (20.55). Thus, (20.57) gives Jacobi’s Euler-Maclaurin formula for
the case in which $x$ and $a$ are not necessarily integers, from which we can derive the
modern form of the Euler-Maclaurin formula, (20.43) and (20.44).

To derive the modern form from Jacobi’s result, take a and x to be positive integers 
and take h = 1. Then 


[ 1 / | a 
/ (ro thos xf (t) 4 tee yi 


= 720) at 


-[ f (dt 4 5(Fe) f@)4 Fe) f'(@) + 


, Bam 


+ my! aT YO) - tail) (20.58) 
Moreover, with $u = 1 - t$,

$$\begin{aligned}
\int_0^1 T_m\sum_a^x f^{(2m+2)}(x-t)\,dt &= \frac{1}{(2m+2)!}\int_0^1\bigl(B_{2m+2}(1-t) - B_{2m+2}\bigr)\bigl(f^{(2m+2)}(a+1-t) + f^{(2m+2)}(a+2-t) + \cdots\bigr)\,dt\\
&= \frac{1}{(2m+2)!}\int_0^1\bigl(B_{2m+2}(u) - B_{2m+2}\bigr)\bigl(f^{(2m+2)}(a+u) + f^{(2m+2)}(a+1+u) + \cdots\bigr)\,du.
\end{aligned} \tag{20.59}$$

Now with $s$ a nonnegative integer such that $a + s + u = v$, we have

$$\int_0^1 B_{2m+2}(u)\,f^{(2m+2)}(a+s+u)\,du = \int_{a+s}^{a+s+1} B_{2m+2}(v-(a+s))\,f^{(2m+2)}(v)\,dv$$

and since $a + s < v < a + s + 1$, so that $\lfloor v\rfloor = a + s$, we may write

$$= \int_{a+s}^{a+s+1} B_{2m+2}(v - \lfloor v\rfloor)\,f^{(2m+2)}(v)\,dv.$$

Denoting $B_{2m+2}(v - \lfloor v\rfloor) - B_{2m+2}$ by $\bar{B}_{2m+2}(v)$, the right-hand side of (20.59)
may be rewritten

$$\begin{aligned}
&\frac{1}{(2m+2)!}\left(\int_a^{a+1}\bar{B}_{2m+2}(v)\,f^{(2m+2)}(v)\,dv + \int_{a+1}^{a+2}\bar{B}_{2m+2}(v)\,f^{(2m+2)}(v)\,dv + \cdots + \int_{x-1}^{x}\bar{B}_{2m+2}(v)\,f^{(2m+2)}(v)\,dv\right)\\
&\qquad= \frac{1}{(2m+2)!}\int_a^x\bigl(B_{2m+2}(v - \lfloor v\rfloor) - B_{2m+2}\bigr)\,f^{(2m+2)}(v)\,dv.
\end{aligned} \tag{20.60}$$

By taking (20.57), (20.58), and (20.60) together, we obtain (20.43) and (20.44), the
modern form of the Euler-Maclaurin formula with remainder term.
In his 1834 paper, Jacobi also proved the theorem: For $0 < x < 1$, the polynomial
$B_{2m}(x) - B_{2m}$ is always negative when $m$ is odd and always positive when $m$ is even.
To understand Jacobi’s proof, recall from Chapter 2 the generating function for
$2(B_{2m}(x) - B_{2m}) = 2\bar{B}_{2m}(x)$:

$$t\left(\frac{e^{xt}-1}{e^t-1} + \frac{e^{-xt}-1}{1-e^{-t}}\right) = t\left(\frac{1-e^{xt}}{1-e^t} - \frac{1-e^{-xt}}{1-e^{-t}}\right). \tag{20.61}$$

Jacobi set $x' = 1 - x$ and rewrote (20.61) as

$$t\,\frac{(1-e^{xt})(1-e^{x't})}{1-e^t} = -t\,\frac{\left(e^{\frac{xt}{2}}-e^{-\frac{xt}{2}}\right)\left(e^{\frac{x't}{2}}-e^{-\frac{x't}{2}}\right)}{e^{\frac{t}{2}}-e^{-\frac{t}{2}}}. \tag{20.62}$$

Jacobi next employed Euler’s product formula (15.18) to express (20.62), the
generating function of $2\bar{B}_{2m}(x)$, as an infinite product, obtaining

$$-xx't^2\prod_{n=1}^{\infty}\frac{\left(1+\frac{x^2t^2}{4n^2\pi^2}\right)\left(1+\frac{x'^2t^2}{4n^2\pi^2}\right)}{1+\frac{t^2}{4n^2\pi^2}} = 2\sum_{m=1}^{\infty}\bar{B}_{2m}(x)\frac{t^{2m}}{(2m)!}. \tag{20.63}$$

He next set $y = -\frac{t^2}{4n^2\pi^2}$, and observed that the $n$th term in the infinite product on
the left-hand side of (20.63) could be written as

$$\begin{aligned}
\frac{(1-x^2y)(1-x'^2y)}{1-y} &= 1 + 2xx'y + \frac{xx'(2+xx')y^2}{1-y}\\
&= 1 + 2xx'y + xx'(2+xx')y^2(1+y+y^2+\cdots).
\end{aligned} \tag{20.64}$$

In view of the given condition that $0 < x < 1$, so that $0 < x' < 1$, Jacobi could
conclude that the coefficients of the powers of $y$ (or powers of $-t^2$) were all positive.
Therefore, in the infinite product on the left-hand side of (20.63), the coefficients of
the odd powers of $-y$ (or $t^2$) were necessarily negative for $0 < x < 1$. This fact in
turn implied that $\bar{B}_{2m}(x)$ in (20.63) was negative with $m$ odd and positive for $m$ even.
And this completed the proof of his theorem.

From this theorem, Jacobi deduced that if $f^{(2m)}(t)$ and $f^{(2m+2)}(t)$ had the same
sign in the interval $(a, x)$, then


/ Bom4o(t — Lt) fO"*?@) dt = 0(-1)" Bomaa(fOmt? (x) — fO"™tY(@), 


where $0 < \theta < 1$. This implied that the error incurred in taking $m + 1$ terms of the
Euler—Maclaurin series must be of the same order as the term just preceding the first 
neglected term. 
20.6 Bernoulli Polynomials


As we have seen, the expressions found by Jacobi and Poisson for the remainder term
in the Euler-Maclaurin formula took quite different forms. When these expressions
are equated, we obtain the trigonometric expansions of the Bernoulli polynomials:

$$B_{2m}(t - [t]) = 2(-1)^{m-1}(2m)!\sum_{l=1}^{\infty}\frac{\cos 2\pi lt}{(2\pi l)^{2m}} \tag{20.65}$$

and taking the derivative, as we showed in equation (16.126), gives us

$$B_{2m-1}(t - [t]) = 2(-1)^{m}(2m-1)!\sum_{l=1}^{\infty}\frac{\sin 2\pi lt}{(2\pi l)^{2m-1}}, \tag{20.66}$$

an equation that holds for $m = 1$ where $0 < t < 1$. Note that the series on the
right-hand sides of equations (20.65) and (20.66) can be shown to be the Fourier series
of the functions on the left-hand sides.

Recall that we saw equations (20.65) and (20.66) in Chapter 16, connected with
the 1809 work of Spence; Euler found results for $B_n(x - [x])$, $n = 1, 2, \ldots$,
equivalent to the initial cases of (20.65) and (20.66) in his 1753 paper, “Subsidium
calculi sinuum.”³¹ By setting $x = ae^{i\varphi}$ in the equation $x + x^2 + \cdots = \frac{x}{1-x}$,
he got

$$a(\cos\varphi + i\sin\varphi) + a^2(\cos 2\varphi + i\sin 2\varphi) + \cdots = \frac{a(\cos\varphi + i\sin\varphi)}{1 - ae^{i\varphi}} = \frac{a\cos\varphi - a^2 + ia\sin\varphi}{1 - 2a\cos\varphi + a^2}. \tag{20.67}$$

On taking the real part of (20.67) and dividing by $a$, he obtained

$$\cos\varphi + a\cos 2\varphi + a^2\cos 3\varphi + \cdots = \frac{\cos\varphi - a}{1 - 2a\cos\varphi + a^2}; \tag{20.68}$$

he then set $a = -1$ to arrive at

$$\cos\varphi - \cos 2\varphi + \cos 3\varphi - \cdots = \frac{1}{2}. \tag{20.69}$$

Of course, $-1$ was not a legitimate value of $a$, because (20.68) is correct only
for $|a| < 1$. Euler had a knack for handling divergent series so as to obtain useful
results; in fact, his method here was later legitimized by the idea of an Abel mean, a
concept discussed in Chapter 32. Note that $\frac{1}{2}$ is the Abel mean of the series $\cos\varphi -
\cos 2\varphi + \cdots$; we also mention here that the Abel mean of the series $\sum_{n=0}^{\infty} a_n$ is given
by $\lim_{x\to 1^-}\sum_{n=0}^{\infty} a_nx^n$.
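The Abel mean of the divergent series in (20.69) can be approximated by evaluating the damped series at a value of $x$ just below 1. The sketch below is illustrative only; the function name, the choice $x = 0.999$, and the truncation length are our assumptions.

```python
import math

def damped_alternating_cosines(phi, x, terms=50000):
    """Return sum_{n=1}^{terms} (-1)^(n-1) x^n cos(n phi), for 0 < x < 1."""
    total = 0.0
    power = 1.0
    sign = 1.0
    for n in range(1, terms + 1):
        power *= x                      # x^n
        total += sign * power * math.cos(n * phi)
        sign = -sign
    return total

for phi in (0.5, 1.0, 2.0):
    print(phi, damped_alternating_cosines(phi, 0.999))
```

Each printed value lies close to $\frac{1}{2}$, independently of $\varphi$, as (20.69) suggests.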


31 Eu. I-14 pp. 542-584. E 246.
Euler integrated equation (20.69) to obtain³²

$$\sin\varphi - \frac{\sin 2\varphi}{2} + \frac{\sin 3\varphi}{3} - \cdots = \frac{\varphi}{2}, \tag{20.70}$$

where the constant of integration can be seen to be zero. Integrating again, he
arrived at

$$\cos\varphi - \frac{\cos 2\varphi}{2^2} + \frac{\cos 3\varphi}{3^2} - \cdots = \frac{\pi^2}{12} - \frac{\varphi^2}{4}, \tag{20.71}$$

since the constant of integration was

$$C = 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots = \frac{\pi^2}{12}. \tag{20.72}$$


Now observe that if we set 6 = a — @ in (20.70) and (20.71), we will get 
the series for By (4) and Bo (+), series that hold for 0 < a < l. Ina 
1773 paper, “Theoria Elementaria serierum,”>? Daniel Bernoulli started with (20.68), 
took a = 1, and repeatedly integrated to obtain the trigonometric series for 
By, ( x) , Bo ( ~) pet BE ( x). A year later, Euler followed up with a paper, “Nova 
methodus quantitates integrales determinandi,”*+ dealing with the integral and series 


expressions for Byy, (4). In section 40 of his paper, he integrated the formula 


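$$\cos\varphi + \cos 2\varphi + \cos 3\varphi + \cdots = -\frac{1}{2}$$

to arrive at

$$\frac{\sin\varphi}{1} + \frac{\sin 2\varphi}{2} + \frac{\sin 3\varphi}{3} + \cdots = A - \frac{\varphi}{2}. \tag{20.73}$$

He observed that he could not determine the value of $A$ by setting $\varphi = 0$, reasoning
that when $\varphi$ was small, then $\sin n\varphi = n\varphi$, so that the left-hand side of (20.73) became
$\varphi + \varphi + \varphi + \cdots$, and this could not equal the right-hand side. However, he noted that
he could set $\varphi = \pi$, because when $\varphi = \pi + \omega$, where $\omega$ was an infinitesimal, he got
the left-hand side as $\omega - \omega + \omega - \omega + \cdots$. Taking $\varphi = \pi$ gave him $0 = A - \frac{\pi}{2}$ or
$A = \frac{\pi}{2}$, so that (20.73) could be written as

$$\sin\varphi + \frac{\sin 2\varphi}{2} + \frac{\sin 3\varphi}{3} + \cdots = \frac{\pi}{2} - \frac{\varphi}{2}, \quad 0 < \varphi < 2\pi; \tag{20.74}$$

after integration, Euler obtained

$$\cos\varphi + \frac{\cos 2\varphi}{2^2} + \frac{\cos 3\varphi}{3^2} + \cdots = B - \frac{\pi\varphi}{2} + \frac{\varphi^2}{4}. \tag{20.75}$$


32 ibid. § 53.
33 Bernoulli (1982-1996) vol. 2, pp. 119-134, especially pp. 120-121.
34 Eu. I-17 pp. 421-457. E 464.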
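He set $\varphi = 0$ and $\varphi = \pi$ in (20.75) to get the two equations

$$\sum_{n=1}^{\infty}\frac{1}{n^2} = B \quad\text{and}\quad \sum_{n=1}^{\infty}\frac{(-1)^n}{n^2} = B - \frac{\pi^2}{4};$$

adding, he obtained

$$2\sum_{n=1}^{\infty}\frac{1}{(2n)^2} = \frac{1}{2}\sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{1}{2}B = 2B - \frac{\pi^2}{4},$$

so that $B = \frac{\pi^2}{6}$. Thus

$$\sum_{n=1}^{\infty}\frac{\cos n\varphi}{n^2} = \frac{\pi^2}{6} - \frac{\pi\varphi}{2} + \frac{\varphi^2}{4}. \tag{20.76}$$

Another integration produced

$$\sum_{n=1}^{\infty}\frac{\sin n\varphi}{n^3} = \frac{\pi^2\varphi}{6} - \frac{\pi\varphi^2}{4} + \frac{\varphi^3}{12},$$

and yet another integration yielded

$$\sum_{n=1}^{\infty}\frac{\cos n\varphi}{n^4} = \frac{\pi^4}{90} - \frac{\pi^2\varphi^2}{12} + \frac{\pi\varphi^3}{12} - \frac{\varphi^4}{48}.$$

Euler found the constant by evaluating $\varphi = 0$ and $\varphi = \pi$ as in the derivation
of (20.76); he noted that this method of calculating the constants was contained in
Bernoulli’s 1773 paper. Euler calculated the next two cases in a similar manner:

$$\sum_{n=1}^{\infty}\frac{\sin n\varphi}{n^5} = \frac{\pi^4\varphi}{90} - \frac{\pi^2\varphi^3}{36} + \frac{\pi\varphi^4}{48} - \frac{\varphi^5}{240},$$

$$\sum_{n=1}^{\infty}\frac{\cos n\varphi}{n^6} = \frac{\pi^6}{945} - \frac{\pi^4\varphi^2}{180} + \frac{\pi^2\varphi^4}{144} - \frac{\pi\varphi^5}{240} + \frac{\varphi^6}{1440}. \tag{20.77}$$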


Apparently, Euler did not observe the connection between polynomials such as the
right-hand side of equation (20.77) and the polynomials obtained by Jakob Bernoulli
when he expressed the sum $1^k + 2^k + \cdots + (n-1)^k$ as a polynomial in $n$. For example,
Bernoulli’s table for sums of powers, given in our Chapter 2, has

Now if we set $\varphi = 2\pi t$ in (20.77), we have

$$2(6!)\sum_{n=1}^{\infty}\frac{\cos 2\pi nt}{(2\pi n)^6} = t^6 - 3t^5 + \frac{5}{2}t^4 - \frac{1}{2}t^2 + \frac{1}{42}, \quad 0 \le t \le 1; \tag{20.78}$$

had Euler written (20.78), rather than (20.77), he might have noticed this relationship.
As it is, both Euler and Daniel Bernoulli succeeded in deriving the trigonometric
series expansions for the Bernoulli polynomials.
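Formulas (20.76) and (20.78) lend themselves to a quick numerical comparison of partial sums with the closed forms. The sketch below is a plain floating-point check under our own choice of sample points and truncation lengths; all function names are hypothetical.

```python
import math

def cosine_series(phi, power, terms):
    """Return sum_{n=1}^{terms} cos(n phi) / n^power."""
    return sum(math.cos(n * phi) / n ** power for n in range(1, terms + 1))

def closed_form_2076(phi):
    """Right-hand side of (20.76), valid for 0 <= phi <= 2 pi."""
    return math.pi ** 2 / 6 - math.pi * phi / 2 + phi ** 2 / 4

def sixth_bernoulli_polynomial(t):
    """Right-hand side of (20.78)."""
    return t ** 6 - 3 * t ** 5 + 2.5 * t ** 4 - 0.5 * t ** 2 + 1 / 42

def left_side_2078(t, terms=200):
    """Left-hand side of (20.78), truncated after `terms` terms."""
    scale = 2 * math.factorial(6) / (2 * math.pi) ** 6
    return scale * cosine_series(2 * math.pi * t, 6, terms)

print(cosine_series(1.0, 2, 200000), closed_form_2076(1.0))
print(left_side_2078(0.3), sixth_bernoulli_polynomial(0.3))
```

The series in (20.78) converges far more quickly than the one in (20.76), since its terms decay like $n^{-6}$ rather than $n^{-2}$.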
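The main purpose of Euler’s 1774 “Nova methodus”³⁵ paper was to express the
Bernoulli polynomials as integrals. Euler denoted by $P$ and $Q$ the real and imaginary
parts of equation (20.67):

$$P = a\cos\varphi + a^2\cos 2\varphi + a^3\cos 3\varphi + \cdots = \frac{a\cos\varphi - a^2}{1 - 2a\cos\varphi + a^2}, \tag{20.79}$$

$$Q = a\sin\varphi + a^2\sin 2\varphi + a^3\sin 3\varphi + \cdots = \frac{a\sin\varphi}{1 - 2a\cos\varphi + a^2}. \tag{20.80}$$

In addition he defined:

$$P_{n+1} = \int_0^a\frac{P_n}{a}\,da, \quad P_0 = P;$$

$$Q_{n+1} = \int_0^a\frac{Q_n}{a}\,da, \quad Q_0 = Q.$$

Actually, Euler wrote out only the series for $P_1, Q_1, P_2, Q_2, P_3, Q_3, P_4, Q_4$; he did not
define the general case. Thus, he had:

$$\begin{aligned}
P_1(a) &= \frac{a\cos\varphi}{1} + \frac{a^2\cos 2\varphi}{2} + \frac{a^3\cos 3\varphi}{3} + \cdots,\\
Q_1(a) &= \frac{a\sin\varphi}{1} + \frac{a^2\sin 2\varphi}{2} + \frac{a^3\sin 3\varphi}{3} + \cdots,\\
P_2(a) &= \frac{a\cos\varphi}{1^2} + \frac{a^2\cos 2\varphi}{2^2} + \frac{a^3\cos 3\varphi}{3^2} + \cdots,\\
Q_2(a) &= \frac{a\sin\varphi}{1^2} + \frac{a^2\sin 2\varphi}{2^2} + \frac{a^3\sin 3\varphi}{3^2} + \cdots,
\end{aligned}$$

and so on; observe that if we set $a = 1$ in these series, then $Q_1(1)$ is essentially
$B_1\!\left(\frac{\varphi}{2\pi}\right)$, $P_2(1)$ is essentially $B_2\!\left(\frac{\varphi}{2\pi}\right)$, and so on. Euler observed that


35 ibid, especially sections 32-39.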
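$$\begin{aligned}
P_2(1) &= \int_0^1\frac{P_1(a)}{a}\,da = \Bigl[P_1(a)\ln a\Bigr]_0^1 - \int_0^1\frac{P(a)}{a}\ln a\,da\\
&= -\int_0^1\frac{P(a)}{a}\ln a\,da
\end{aligned}$$

and similarly

$$Q_2(1) = -\int_0^1\frac{Q(a)}{a}\ln a\,da$$

and, in general,

$$P_n(1) = \frac{(-1)^{n-1}}{(n-1)!}\int_0^1\frac{P(a)}{a}(\ln a)^{n-1}\,da,$$

$$Q_n(1) = \frac{(-1)^{n-1}}{(n-1)!}\int_0^1\frac{Q(a)}{a}(\ln a)^{n-1}\,da.$$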


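Euler’s formulas for $P_n(1)$ and $Q_n(1)$ can also be expressed in terms of Bernoulli
polynomials:

$$B_{2n}(\varphi) = (-1)^n\frac{4n}{(2\pi)^{2n}}\int_0^1\frac{(\cos 2\pi\varphi - a)(\ln a)^{2n-1}}{1 - 2a\cos 2\pi\varphi + a^2}\,da,$$

$$B_{2n+1}(\varphi) = (-1)^{n+1}\frac{4n+2}{(2\pi)^{2n+1}}\int_0^1\frac{\sin 2\pi\varphi\,(\ln a)^{2n}}{1 - 2a\cos 2\pi\varphi + a^2}\,da.$$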


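Now Euler obtained (20.74) by integrating a divergent series. We may ask, then,
did he have another method of obtaining that result? Indeed, Euler himself noted that
by term-by-term integration of (20.80),

$$\begin{aligned}
Q_1(1) &= \frac{\sin\varphi}{1} + \frac{\sin 2\varphi}{2} + \frac{\sin 3\varphi}{3} + \cdots\\
&= \int_0^1\frac{a\sin\varphi}{1 - 2a\cos\varphi + a^2}\,\frac{da}{a} = \arctan\frac{a\sin\varphi}{1 - a\cos\varphi}\bigg|_0^1\\
&= \arctan\frac{\sin\varphi}{1 - \cos\varphi} = \frac{\pi}{2} - \frac{\varphi}{2}, \quad 0 < \varphi < 2\pi.
\end{aligned}$$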

Of course, the term-by-term integration is the most problematic step to justify.
However, since a power series is involved, one does not require the concept of uniform
convergence for this step.³⁶ We note that later mathematicians, such as Abel and
Dedekind, dealt with exactly these formulas of Euler and their convergence problems.
In addition to this approach, Euler offered yet another method, perhaps more easily
justified, in his 1773 paper “Summatio progressionum.”³⁷ In example 2, contained in
the final portion of the paper, Euler considered the formula

$$-\log(1 - z) = z + \frac{z^2}{2} + \frac{z^3}{3} + \cdots,$$

in which he set $z = xe^{i\varphi}$ and $z = xe^{-i\varphi}$ to arrive at

$$\begin{aligned}
2\sqrt{-1}\left(x\sin\varphi + \frac{x^2\sin 2\varphi}{2} + \cdots\right) &= -\log(1 - xe^{i\varphi}) + \log(1 - xe^{-i\varphi})\\
&= \log\frac{1 - x\cos\varphi + ix\sin\varphi}{1 - x\cos\varphi - ix\sin\varphi}\\
&= 2\sqrt{-1}\arctan\frac{x\sin\varphi}{1 - x\cos\varphi}, \quad \varphi \ne 0.
\end{aligned}$$


36 See, for example, Nevanlinna and Paatero (2007) pp. 103-105.


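Taking $x = 1$ produced the result. To justify this step, we may apply Abel’s
continuity theorem, discussed in Chapter 4. First we must verify the convergence of
the series

$$e^{i\varphi} + \frac{e^{2i\varphi}}{2} + \frac{e^{3i\varphi}}{3} + \cdots. \tag{20.81}$$

For this purpose, we apply a procedure due to Dirichlet, based on Abel’s method
of summation by parts. In fact, in a paper of 1877,³⁸ Dedekind applied exactly this
procedure to prove that this series and more general series converged. Let $s_n = e^{i\varphi} +
e^{2i\varphi} + \cdots + e^{in\varphi}$ and let $s$ denote the series (20.81). Then

$$\begin{aligned}
s &= \frac{s_1}{1} + \frac{s_2 - s_1}{2} + \frac{s_3 - s_2}{3} + \cdots\\
&= s_1\left(1 - \frac{1}{2}\right) + s_2\left(\frac{1}{2} - \frac{1}{3}\right) + s_3\left(\frac{1}{3} - \frac{1}{4}\right) + \cdots.
\end{aligned} \tag{20.82}$$

Observe that

$$s_n = \frac{e^{i\varphi}(1 - e^{in\varphi})}{1 - e^{i\varphi}} = \frac{-(1 - e^{i\varphi})(1 - e^{in\varphi})}{2 - 2\cos\varphi}.$$

Thus, for a fixed $\varphi \ne 0$, $|s_n| \le \frac{1}{\sin\frac{\varphi}{2}}$. It then follows from (20.82) that the series for $s$ converges
absolutely, since

$$|s_1|\left(1 - \frac{1}{2}\right) + |s_2|\left(\frac{1}{2} - \frac{1}{3}\right) + |s_3|\left(\frac{1}{3} - \frac{1}{4}\right) + \cdots \le \frac{1}{\sin\frac{\varphi}{2}}.$$


37 Eu. I-15 pp. 168-184. E 447.
38 Dedekind (1877).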
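Thus, the series $s$ converges when $0 < \varphi < 2\pi$. Now Abel’s continuity theorem
states that if the series $a_1 + a_2 + a_3 + \cdots$ converges to $A$, then

$$\lim_{x\to 1^-}(a_1x + a_2x^2 + a_3x^3 + \cdots) = A;$$

for more on this theorem, see Section 4.5.
Taking $a_n = \frac{e^{in\varphi}}{n}$, we have

$$xe^{i\varphi} + \frac{x^2e^{2i\varphi}}{2} + \frac{x^3e^{3i\varphi}}{3} + \cdots = -\log(1 - xe^{i\varphi}), \quad 0 < \varphi < 2\pi, \quad 0 \le x < 1.$$

Since $\log(1 - xe^{i\varphi})$, $0 < \varphi < 2\pi$, is a continuous function in $x$, it follows that

$$\lim_{x\to 1^-}\log(1 - xe^{i\varphi}) = \log(1 - e^{i\varphi}).$$

Moreover, Abel’s theorem implies that

$$\lim_{x\to 1^-}\left(xe^{i\varphi} + \frac{x^2e^{2i\varphi}}{2} + \cdots\right) = e^{i\varphi} + \frac{e^{2i\varphi}}{2} + \cdots.$$

This completes the proof that

$$e^{i\varphi} + \frac{e^{2i\varphi}}{2} + \frac{e^{3i\varphi}}{3} + \cdots = -\log(1 - e^{i\varphi}).$$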
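The conditional convergence established above can be observed numerically: partial sums of $\sum e^{in\varphi}/n$ approach $-\log(1 - e^{i\varphi})$, though only at a rate of order $1/N$. The sketch below uses the principal branch of the complex logarithm, which is the appropriate one here because $1 - e^{i\varphi}$ has positive real part for $0 < \varphi < 2\pi$; the function name and sample values are our assumptions.

```python
import cmath

def partial_sum(phi, terms):
    """Return sum_{n=1}^{terms} e^{i n phi} / n."""
    total = 0j
    for n in range(1, terms + 1):
        total += cmath.exp(1j * n * phi) / n
    return total

phi = 2.0
limit = -cmath.log(1 - cmath.exp(1j * phi))
for terms in (100, 10000, 100000):
    print(terms, abs(partial_sum(phi, terms) - limit))
```

The imaginary part of the limit is $\frac{\pi}{2} - \frac{\varphi}{2}$, recovering (20.74).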
Finally, we note that equation (20.78) and a similar equation involving the sine
series can be generalized:³⁹

$$2(-1)^{k-1}(2k)!\sum_{n=1}^{\infty}\frac{\cos 2\pi nt}{(2\pi n)^{2k}} = B_{2k}(t), \quad k \ge 1, \quad 0 \le t \le 1 \tag{20.83}$$

and

$$2(-1)^{k}(2k-1)!\sum_{n=1}^{\infty}\frac{\sin 2\pi nt}{(2\pi n)^{2k-1}} = B_{2k-1}(t), \quad k > 1, \quad 0 \le t \le 1, \tag{20.84}$$

results apparently first stated and proved by Raabe, who recognized that the polynomials
on the right-hand sides of equations (20.83) and (20.84) were Bernoulli
polynomials; these equations in turn imply

$$B_n(t) = (-1)^nB_n(1 - t)$$

and

$$|B_{2n}(t)| \le |B_{2n}(0)| = |B_{2n}|.$$


39 Raabe (1850).
20.7 Number Theoretic Properties of Bernoulli Numbers


Recall that Bernoulli numbers, $B_{2k}$, are rational numbers. Take their reduced form
to be $B_{2k} = \frac{N_{2k}}{D_{2k}}$, with $N_{2k}$ and $D_{2k}$ relatively prime. Thomas Clausen and Karl
von Staudt independently published a complete determination of $D_{2k}$ in 1840. Clausen
first published the theorem without proof in the Astronomische Nachrichten, with a
promise to publish the proof. Soon thereafter, von Staudt published a proof,⁴¹ ending
his paper by remarking that many years earlier he had communicated the result to
Gauss and later, upon seeing Clausen’s one-paragraph announcement, he (von Staudt)
decided to publish his proof. Clausen did not publish his proof, presumably because of
von Staudt’s paper.

The theorem in question states that $B_{2k} + \sum_{p-1\mid 2k}\frac{1}{p}$, where the sum is over primes $p$, is an integer.
In particular, $D_{2k}$ is the product of the primes $p$ such that $p - 1 \mid 2k$, and $6 \mid D_{2k}$.
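The Clausen-von Staudt theorem is easy to test for small indices with exact rational arithmetic. The sketch below computes Bernoulli numbers from the standard recurrence and checks both statements of the theorem; it is a verification aid with hypothetical function names, not a reconstruction of either author’s proof.

```python
from fractions import Fraction
from math import comb, prod

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        partial = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-partial / (m + 1))
    return B

def is_prime(p):
    """Trial-division primality test, adequate for the small p used here."""
    return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))

def clausen_von_staudt(k, B):
    """Return (B_2k + sum of 1/p, list of primes p with (p - 1) | 2k)."""
    primes = [p for p in range(2, 2 * k + 2)
              if is_prime(p) and (2 * k) % (p - 1) == 0]
    return B[2 * k] + sum(Fraction(1, p) for p in primes), primes

B = bernoulli_numbers(40)
for k in range(1, 21):
    total, primes = clausen_von_staudt(k, B)
    print(2 * k, B[2 * k], primes, total)
```

For $2k = 12$, for instance, the primes are 2, 3, 5, 7, 13, whose product 2730 is the denominator of $B_{12} = -\frac{691}{2730}$.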

In 1911, Ramanujan published his first paper,⁴² and it contained a suggestion for a
proof of the Clausen-von Staudt result. The topic of this paper was Bernoulli numbers, 
and much of it was rediscovered material, but Ramanujan took a fresh perspective in 
approaching the topic. He wrote that several number theoretic properties of Bernoulli 
numbers would follow from properties of specific series. In this connection, he 
observed that if it could be proved that 


1 1 1 
2 oe 
(= x+4 x+6 ) 


could be expanded in ascending powers of $\frac{1}{x}$ with integer coefficients, then it would
follow that 27 (224 — 1) Bok was an integer. 

In fact, Ramanujan stated the Clausen-von Staudt formula as a result on series. 
However, this result had already been proved by K. Schwering, a student of Weier- 
strass and Kummer at Berlin. Schwering’s proof involved infinite series.*? He first 
showed that 


\[ \frac{1}{x} + \frac{1}{2x^2} + \frac{B_2}{x^3} + \frac{B_4}{x^5} + \frac{B_6}{x^7} + \frac{B_8}{x^9} + \cdots = \sum_{k=1}^{\infty} \frac{(k-1)!}{k\,x(x+1)(x+2)\cdots(x+k-1)}, \tag{20.85} \]

derivable by combining (10.54) due to Stirling with (10.75) due to de Moivre. Note
that in (20.85), the series on the left-hand side is an asymptotic series and the series
on the right-hand side is an inverse factorial series that converges for $x > 0$. When
the latter series is expanded in powers of $\frac{1}{x}$, the result is an asymptotic series and
the different powers of $\frac{1}{x}$ can be equated. Unlike Stirling and de Moivre, Schwering


40 Clausen (1840). 

41 von Staudt (1840). 
42 Ramanujan (1911). 
43 Schwering (1899). 
regarded the first Bernoulli number as $-\frac{1}{2}$, as we do today. Thus, the formula of
Schwering corresponding to (20.85) had $-\frac{1}{2x^2}$ as the second term on the left-hand
side, so that the denominator on the right-hand side was $k(x+1)\cdots(x+k)$. Of
course, this difference does not affect the calculation.

At this point, Schwering considered the term on the right-hand side of (20.85) (in
Stirling’s notation),

\[ \frac{(k-1)!}{k\,x(x+1)(x+2)\cdots(x+k-1)}. \tag{20.86} \]

He observed that if $k$ were a composite integer greater than 4, then $k$ would divide
$(k-1)!$. Thus, (20.86) could be expanded in powers of $\frac{1}{x}$ with integer coefficients,
because (20.86) could be expressed as

\[ \frac{(k-1)!}{k\,x^k}\left(1+\frac{1}{x}\right)^{-1}\left(1+\frac{2}{x}\right)^{-1}\cdots\left(1+\frac{k-1}{x}\right)^{-1} = \frac{(k-1)!}{k\,x^k}\left(1 - \frac{1}{x} + \frac{1}{x^2} - \cdots\right)\cdots\left(1 - \frac{k-1}{x} + \frac{(k-1)^2}{x^2} - \cdots\right). \]

The idea of Schwering’s proof of the Clausen-von Staudt formula was that since
$B_{2k}$ was the coefficient of $\frac{1}{x^{2k+1}}$ on the left-hand side of (20.85), it was sufficient to
show that, modulo the integers and for primes $p$, the coefficient of the odd power $\frac{1}{x^{2k+1}}$
on the right-hand side was $-\sum_{p-1 \mid 2k} \frac{1}{p}$.

For $k = 4$, Schwering noted that the term in (20.86) became

\[ \frac{3}{2x(x+1)(x+2)(x+3)}; \tag{20.87} \]

also note that

\[ \frac{1}{x(x+1)(x+2)(x+3)} \equiv \frac{1}{x^2(x^2-1)} = \frac{1}{x^4}\left(1 + \frac{1}{x^2} + \frac{1}{x^4} + \cdots\right) \mod 2. \tag{20.88} \]

Thus, the term (20.87) did not contribute a fraction to the odd powers of $\frac{1}{x}$ because
the coefficients of these powers in the expansion of $\frac{3}{x(x+1)(x+2)(x+3)}$ were all even
integers that cancelled with the 2 in the denominator to produce an integer. We see
that for the case in which $k$ was composite, Schwering did not have to consider the
terms in (20.86). For the case $k = p$, a prime, he noted that the congruence

\[ x(x+1)(x+2)\cdots(x+p-1) \equiv x(x^{p-1} - 1) \mod p \tag{20.89} \]

and Wilson’s theorem

\[ (p-1)! \equiv -1 \mod p \]

implied that

\[ \frac{(p-1)!}{p\,x(x+1)(x+2)\cdots(x+p-1)} = \frac{-1}{p\,x(x^{p-1}-1)} + \text{a series in } \frac{1}{x} \text{ with integer coefficients}. \tag{20.90} \]


Note that Lagrange’s proof of (20.89) and Wilson’s theorem appear in Section 10.8. 
We may now write 


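\[ \frac{-1}{p\,x(x^{p-1}-1)} = -\frac{1}{px}\left(\frac{1}{x^{p-1}} + \frac{1}{x^{2(p-1)}} + \frac{1}{x^{3(p-1)}} + \cdots\right). \]

Therefore, if $p - 1 \mid 2k$, then the term (20.86) with index $p$ contributes $-\frac{1}{p} + \text{integer}$ to the
coefficient of $\frac{1}{x^{2k+1}}$. Thus, since $B_{2k}$ is the coefficient of $\frac{1}{x^{2k+1}}$ on the left-hand side
of (20.85),

\[ B_{2k} = -\sum_{p-1 \mid 2k} \frac{1}{p} + \text{integer}, \]

completing the proof.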
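
The three arithmetic facts on which Schwering’s argument rests, namely the congruence (20.89), Wilson’s theorem, and the divisibility of $(k-1)!$ by every composite $k > 4$, can be tested for small values. The sketch below is illustrative only; the helper names and the ranges tested are assumptions, not anything taken from Schwering’s paper.

```python
from math import factorial


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def rising_product_mod(p):
    """Coefficients (lowest degree first) of x(x+1)...(x+p-1), reduced mod p."""
    coeffs = [1]  # start from the constant polynomial 1
    for a in range(p):
        shifted = [0] + coeffs                  # x * poly
        scaled = [a * c for c in coeffs] + [0]  # a * poly
        coeffs = [(s + t) % p for s, t in zip(shifted, scaled)]
    return coeffs


def lagrange_congruence_holds(p):
    """Check x(x+1)...(x+p-1) == x^p - x (mod p), which is (20.89)."""
    target = [0] * (p + 1)
    target[p] = 1          # coefficient of x^p
    target[1] = (-1) % p   # coefficient of -x
    return rising_product_mod(p) == target


primes = [p for p in range(2, 40) if is_prime(p)]
composites = [k for k in range(5, 200) if not is_prime(k)]

print(all(lagrange_congruence_holds(p) for p in primes))
print(all(factorial(p - 1) % p == p - 1 for p in primes))
print(all(factorial(k - 1) % k == 0 for k in composites))
```

The case $k = 4$ is excluded from the last check because $3! = 6$ is not divisible by 4, which is why that term required the separate treatment in (20.87) and (20.88).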
In his 1911 paper, Ramanujan wrote that the theorem now credited to Clausen and
von Staudt could be deduced from the proposition: The series

\[ \frac{1}{2x^2} + \frac{1}{(x+1)^2} + \frac{1}{(x+2)^2} + \cdots - \frac{1}{x} - \frac{1}{6(x^3 - x)} + \frac{1}{5(x^5 - x)} + \frac{1}{7(x^7 - x)} + \cdots, \]

where 5, 7, 11, 13, ... are primes greater than 3, can be expanded in ascending powers
of $\frac{1}{x}$ with integral coefficients; note that this proposition can be inferred from
Schwering’s work. Samuel Wagstaff’s paper⁴⁴ gives a full discussion of this matter
and other results that Ramanujan rediscovered, with a complete list of references.
Ramanujan indicated a proof that $\frac{2^{2k}(2^{2k} - 1)B_{2k}}{2k}$, with $k = 1, 2, 3, \ldots$, were
integers. Given his familiarity with series, Ramanujan most probably knew that these
were tangent numbers, and thus integers, a fact first proved by Euler as we saw in
equation (16.84). However, it appears that here Ramanujan wished to indicate a proof
along the same lines as the proof of the Clausen-von Staudt theorem. He briefly stated


that, based on the asymptotic series for ree , one could derive the equation 
\[ \frac{1}{x+2} - \frac{1}{x+4} + \frac{1}{x+6} - \cdots = \frac{1}{2x} - 2(2^2 - 1)\frac{B_2}{2x^2} - 2^3(2^4 - 1)\frac{B_4}{4x^4} - 2^5(2^6 - 1)\frac{B_6}{6x^6} - \cdots. \tag{20.91} \]


44 Wagstaff (1981). 
From Boole’s formula, originally due to Euler, we may derive (20.91); in this
connection see Section 21.8 on Boole’s symbolic method. In Boole’s formula,

\[ f(x) - f(x+1) + f(x+2) - \cdots = \frac{1}{2} f(x) + \sum_{n=1}^{\infty} \frac{(1 - 2^{2n})B_{2n}}{(2n)!} \frac{d^{2n-1}f}{dx^{2n-1}}, \]

take $f(x) = \frac{1}{x}$ and then subtract $\frac{1}{x}$ from each side; finally, change $x$ to $\frac{x}{2}$ and
divide by $-2$ to arrive at (20.91).

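Next, express two times the left-hand side of (20.91) as an inverse factorial series
to show that it can be expanded in ascending powers of $\frac{1}{x}$ with integer coefficients.
To find this expression in terms of inverse factorials, we can follow a method that
Ramanujan himself often employed, for example in his paper, “A series for Euler’s
constant γ.”⁴⁵ Now since

\[ \frac{2}{x+2n} = \int_0^1 t^{\frac{x}{2}+n-1}\,dt, \quad n = 1, 2, 3, \ldots, \]

we have

\[ 2\left(\frac{1}{x+2} - \frac{1}{x+4} + \frac{1}{x+6} - \cdots\right) = \int_0^1 \left(t^{\frac{x}{2}} - t^{\frac{x}{2}+1} + t^{\frac{x}{2}+2} - \cdots\right) dt = \int_0^1 \frac{t^{\frac{x}{2}}}{1+t}\,dt. \tag{20.92} \]

Using integration by parts or from (17.24), we may write

\[ \frac{1}{2}\int_0^1 t^{\frac{x}{2}}\left(\frac{1-t}{2}\right)^n dt = \frac{n!}{(x+2)(x+4)\cdots(x+2n+2)}. \tag{20.93} \]

Applying (20.93) and the expansion

\[ \frac{1}{1+t} = \frac{1}{2}\cdot\frac{1}{1 - \frac{1-t}{2}} = \frac{1}{2}\sum_{n=0}^{\infty}\left(\frac{1-t}{2}\right)^n, \quad 0 \le t \le 1, \]

we see that the right-hand side of (20.92) may be given as

\[ \int_0^1 \frac{t^{\frac{x}{2}}}{1+t}\,dt = \frac{1}{2}\sum_{n=0}^{\infty}\int_0^1 t^{\frac{x}{2}}\left(\frac{1-t}{2}\right)^n dt = \sum_{n=0}^{\infty} \frac{n!}{(x+2)(x+4)\cdots(x+2n+2)}. \]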

45 Ramanujan (1917). 
Thus, we have shown that the left-hand side of (20.92) can be expanded as an
inverse factorial series:

\[ 2\left(\frac{1}{x+2} - \frac{1}{x+4} + \frac{1}{x+6} - \cdots\right) = \sum_{n=0}^{\infty} \frac{n!}{(x+2)(x+4)\cdots(x+2n+2)}. \tag{20.94} \]
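
As a numerical sanity check on (20.94), one can compare the inverse factorial series with values of the left-hand side that are known in closed form. This is a sketch under stated assumptions: at $x = 2$ the left-hand side is $\frac{1}{2} - \frac{1}{3} + \frac{1}{4} - \cdots = 1 - \ln 2$, and at $x = 4$ it is $\frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots = \ln 2 - \frac{1}{2}$; the function name and the number of terms are illustrative choices.

```python
from math import log


def inverse_factorial_series(x, terms=80):
    """Partial sum of sum_{n>=0} n! / ((x+2)(x+4)...(x+2n+2))."""
    total = 0.0
    term = 1.0 / (x + 2)  # the n = 0 term
    for n in range(terms):
        total += term
        # Ratio of consecutive terms: (n + 1) / (x + 2n + 4).
        term *= (n + 1) / (x + 2 * n + 4)
    return total


# Compare with the closed forms of the alternating series at x = 2 and x = 4.
print(inverse_factorial_series(2), 1 - log(2))
print(inverse_factorial_series(4), log(2) - 0.5)
```

The terms of the inverse factorial series decrease roughly like $2^{-n}$ here, so the series is far better suited to computation than the slowly converging alternating series on the left.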


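Moreover, since

\[ \frac{n!}{(x+2)(x+4)\cdots(x+2n+2)} = \frac{n!}{x^{n+1}}\left(1+\frac{2}{x}\right)^{-1}\left(1+\frac{4}{x}\right)^{-1}\cdots\left(1+\frac{2n+2}{x}\right)^{-1}, \]

we see that the right-hand side of (20.94) can be expanded in powers of $\frac{1}{x}$ with integer
coefficients, thus confirming that the tangent numbers $\frac{2^{2k}(2^{2k} - 1)B_{2k}}{2k}$, $k = 1, 2, 3, \ldots$
are integers.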

Ramanujan also remarked that $2(2^{2k} - 1)B_{2k}$ could be shown to be an integer if it
could be proved that

\[ 2\left(\frac{1}{(x+1)^2} - \frac{1}{(x+2)^2} + \frac{1}{(x+3)^2} - \cdots\right) \tag{20.95} \]

was expandable in powers of $\frac{1}{x}$ with integer coefficients. He was able to show that the
expression (20.95) was equal to

\[ \frac{1}{x^2} - 2\sum_{k=1}^{\infty} (2^{2k} - 1)\frac{B_{2k}}{x^{2k+1}}, \tag{20.96} \]

starting with de Moivre’s formula (10.75), taken in the form

\[ \frac{1}{(x+1)^2} + \frac{1}{(x+2)^2} + \frac{1}{(x+3)^2} + \cdots = \frac{1}{x} - \frac{1}{2x^2} + \frac{B_2}{x^3} + \cdots + \frac{B_{2k}}{x^{2k+1}} + \cdots. \tag{20.97} \]

He then changed $x$ to $\frac{x}{2}$ in (20.97) and subtracted half of the resulting equation from
(20.97) to accomplish this. Note that this result also follows from Boole’s formula.

The next step would be to show that the series (20.95) can be expanded as an inverse
factorial series with integer coefficients, and this can be done by taking the derivative
of (20.94), so that

\[ 2\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{(x+2k)^2} = \sum_{n=0}^{\infty} \frac{n!}{(x+2)(x+4)\cdots(x+2n+2)}\left(\frac{1}{x+2} + \cdots + \frac{1}{x+2n+2}\right), \tag{20.98} \]

a step Ramanujan did not present. However, he made similar arguments in his paper on
Euler’s constant. We can now see that the right-hand side of (20.98) can be expanded
in ascending powers of $\frac{1}{x}$ with integer coefficients. For example, we have

\[ \frac{n!}{(x+2)(x+4)\cdots(x+2n+2)} = \frac{n!}{x^{n+1}}\left(1+\frac{2}{x}\right)^{-1}\left(1+\frac{4}{x}\right)^{-1}\cdots\left(1+\frac{2n+2}{x}\right)^{-1}, \]

and each of the $\left(1+\frac{2}{x}\right)^{-1}, \left(1+\frac{4}{x}\right)^{-1}, \ldots, \left(1+\frac{2n+2}{x}\right)^{-1}$ can be expanded in a series
of the appropriate type. This proves that $2(2^{2k} - 1)B_{2k}$ is an integer, a result that
actually also follows from the Clausen-von Staudt theorem because we can verify that
if $p$ is an odd prime with $p - 1 \mid 2k$, then $p \mid 2^{2k} - 1$, while the prime 2 is cancelled by the
leading factor 2. Thus, $2(2^{2k} - 1)$ cancels every prime in the denominator
of $B_{2k}$. Ramanujan was surely aware of this, but here he clearly wished to use purely
series considerations to demonstrate that $2(2^{2k} - 1)B_{2k}$ was an integer.
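
Both integrality statements of this section, for $\frac{2^{2k}(2^{2k}-1)B_{2k}}{2k}$ and for $2(2^{2k}-1)B_{2k}$, are easy to test with exact rational arithmetic. The following is a minimal sketch, assuming the modern convention $B_1 = -\frac{1}{2}$ and the standard recurrence for Bernoulli numbers; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        # Recurrence: sum_{j=0}^{m} C(m+1, j) * B_j = 0 for m >= 1.
        acc = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-acc / (m + 1))
    return B


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


K_MAX = 20
B = bernoulli_numbers(2 * K_MAX)

# 2^{2k} (2^{2k} - 1) B_{2k} / (2k): the tangent numbers, up to sign.
tangent_like = [Fraction(2 ** (2 * k) * (2 ** (2 * k) - 1), 2 * k) * B[2 * k]
                for k in range(1, K_MAX + 1)]

# 2 (2^{2k} - 1) B_{2k}: the numbers shown to be integers via (20.98).
doubled = [2 * (2 ** (2 * k) - 1) * B[2 * k] for k in range(1, K_MAX + 1)]

# Fermat's theorem: an odd prime p with (p - 1) | 2k divides 2^{2k} - 1.
fermat_ok = all(pow(2, 2 * k, p) == 1
                for k in range(1, K_MAX + 1)
                for p in range(3, 2 * k + 2)
                if is_prime(p) and (2 * k) % (p - 1) == 0)

print(all(t.denominator == 1 for t in tangent_like))
print(all(d.denominator == 1 for d in doubled))
print(fermat_ok)
```

In absolute value, the first few entries of `doubled` are 1, 1, 3, 17, 155, 2073, 38227, the same integers that appear as coefficients in Euler’s formula (20.102) in the exercises.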

Recall that Euler had shown that $2(2^{2k} - 1)B_{2k}$, $k = 1, 2, 3, \ldots$ were coefficients
in the power series expansion of csc x and were therefore integers.⁴⁶ The reader may
also see the discussion in Section 16.8 and, in particular, equation (16.86).


20.8 Exercises 


(1) In his 1671 letter to Collins, Newton expressed his result on $\sum_{k=0}^{p-1} \frac{a}{b+kc}$ as
follows:

Any musical progression $\frac{a}{b} \cdot \frac{a}{b+c} \cdot \frac{a}{b+2c} \cdot \frac{a}{b+3c} \cdot$ etc. being propounded whose last
term is $\frac{a}{d}$: for ye following operation choose any convenient number $e$ (whither whole
broken or surd) which intercedes these limits $\frac{2mn}{m+n}$ and $\sqrt{mn}$; supposing $b - \frac{1}{2}c$ to bee
$m$, and $d + \frac{1}{2}c$ to bee $n$. And this proportion will give you the aggregate of the terms very
near the truth.

As ye Logarithm of $\frac{e + \frac{1}{2}c}{e - \frac{1}{2}c}$ to ye Logarithm of $\frac{n}{m}$, so is $\frac{a}{e}$ to ye desired summe.

Verify Newton’s approximation, stated in modern terminology: Let $m = b - \frac{1}{2}c$
and $n = d + \frac{1}{2}c$ and $d = b + (p-1)c$. Then

\[ \sum_{k=0}^{p-1} \frac{a}{b+kc} \approx \frac{\frac{a}{e}\ln\left(\frac{n}{m}\right)}{\ln\left(\frac{e + \frac{1}{2}c}{e - \frac{1}{2}c}\right)}, \]

where $\frac{2mn}{m+n} < e < \sqrt{mn}$. See Newton (1959-1960) vol. 1, pp. 68-70. The
editorial notes on p. 70 are helpful.

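(2) Suppose $p = \int \frac{\partial x}{x} \ln y$ and $q = \int \frac{\partial y}{y} \ln x$, where the symbols $\partial x$, $\partial y$ denote
partial differentiation.

(a) Show that $p + q = \ln x \ln y + C$.
(b) Take $y = 1 - x$, $0 < x < 1$. Show that

\[ p = -\sum_{n=1}^{\infty} \frac{x^n}{n^2} \quad \text{and} \quad q = -\sum_{n=1}^{\infty} \frac{y^n}{n^2}. \]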


46 Eu. 1-109 § 223. E 212. 
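Let $x \to 1$ to get $C = -\sum_{n=1}^{\infty} \frac{1}{n^2} = -\frac{\pi^2}{6}$. Thus,

\[ \mathrm{Li}_2(x) + \mathrm{Li}_2(y) = \frac{\pi^2}{6} - \ln x \ln y, \tag{20.99} \]

where $\mathrm{Li}_2(x) = \sum_{n=1}^{\infty} \frac{x^n}{n^2}$.
(c) Take $y = x - 1$. Observe that

\[ \ln y = \ln x + \ln\left(1 - \frac{1}{x}\right) = \ln x - \frac{1}{x} - \frac{1}{2x^2} - \frac{1}{3x^3} - \text{etc.} \]

Show that

\[ p = \frac{1}{2}(\ln x)^2 + \mathrm{Li}_2\left(\frac{1}{x}\right) \quad \text{and} \quad q = -\mathrm{Li}_2(-y). \]

Deduce that

\[ \mathrm{Li}_2\left(\frac{1}{x}\right) - \mathrm{Li}_2(-y) = \frac{\pi^2}{6} + \ln x \ln \frac{y}{\sqrt{x}}. \]

(d) Deduce from (c) that for $a = \frac{\sqrt{5} - 1}{2}$,

\[ \mathrm{Li}_2(a) - \mathrm{Li}_2(-a) = \frac{\pi^2}{6} - \ln a \ln(a\sqrt{a}). \]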

The results and methods above are from Euler’s paper, written in 1779 and 
published in 1811, devoted entirely to the topic of the dilogarithm. By using 
partial derivatives, he made the proof of (20.99) somewhat shorter than his 
1729 proof of the same result. In 1735 Euler had also been able to evaluate 
ζ(2); he made use of this in his 1779 paper. See Eu. I-16-2 pp. 117-138.

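(3) Let $s = 1^n + 2^n + 3^n + \cdots + x^n$. Use Euler-Maclaurin summation to prove

\[
\begin{aligned}
s = {} & \frac{x^{n+1}}{n+1} + \frac{x^n}{2} + \frac{1}{2}\binom{n}{1}\frac{x^{n-1}}{6} - \frac{1}{4}\binom{n}{3}\frac{x^{n-3}}{30} + \frac{1}{6}\binom{n}{5}\frac{x^{n-5}}{42} \\
& - \frac{1}{8}\binom{n}{7}\frac{x^{n-7}}{30} + \frac{1}{10}\binom{n}{9}\frac{5x^{n-9}}{66} - \frac{1}{12}\binom{n}{11}\frac{691x^{n-11}}{2730} \\
& + \frac{1}{14}\binom{n}{13}\frac{7x^{n-13}}{6} - \frac{1}{16}\binom{n}{15}\frac{3617x^{n-15}}{510} + \text{etc.}
\end{aligned}
\]

Euler gave this formula in his 1736 paper (Eu. I-14 pp. 108-123. E 47.) and
specifically listed the sums for n = 1, 2, ..., 16. Note the explicit appearance
of the Bernoulli numbers, $\frac{1}{6}, \frac{1}{30}, \frac{1}{42}, \frac{1}{30}, \frac{5}{66}, \ldots$, in Euler’s presentation of
the formula. Naturally, Euler wrote

\[ \frac{1}{k+1}\binom{n}{k} \quad \text{as} \quad \frac{n(n-1)\cdots(n-k+1)}{1 \cdot 2 \cdot 3 \cdots (k+1)}. \]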

(4) Verify Euler’s computations leading to the value of γ to fifteen decimal places.
He took n = 10 in (20.4) and determined that

\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{10} = 2.9289682539682539. \]

He also knew that ln 10 = 2.302585092994045684. He used precisely the
terms in (20.4) so that he had to calculate

\[ \frac{1}{20} - \frac{1}{1200} + \frac{1}{1200000} - \cdots + \frac{691}{32760 \times 10^{12}} - \frac{1}{12 \times 10^{14}}. \]

From this he obtained

\[ \text{Const.} = \gamma = \lim_{n \to \infty}\left(\sum_{k=1}^{n} \frac{1}{k} - \ln n\right) = 0.5772156649015329. \]

This is just one example of the kind of numerical calculation that Euler
undertook on a regular, if not daily, basis.
In this connection, also prove the inequalities of Mengoli:

\[ \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{np} < \ln p < \frac{1}{n} + \frac{1}{n+1} + \cdots + \frac{1}{np-1}. \tag{20.100} \]

Mengoli proved these in his Geometria Speciosa of 1659. See Hofmann’s
article on the Euler–Maclaurin formula, in Hofmann (1990) vol. 1, pp. 233-240,
especially p. 237.


(5) Use the Euler–Maclaurin formula to show that

\[ \sum_{k=1}^{\infty} \frac{1}{k^3} = 1.202056903159594, \]

\[ \sum_{k=1}^{\infty} \frac{1}{k^4} = 1.0823232337110824. \]

These results and the evaluation of γ in the previous exercise are in Euler’s
1736 paper (Eu. I-14 pp. 118-121).
(6) Verify Euler’s formal computations to obtain a formula for the alternating
series

\[ s(x) = f(x) - f(x+b) + f(x+2b) - \text{etc.} \]

From $s(x+2b) - s(x) = -f(x) + f(x+b)$, deduce that

\[ f(x+b) - f(x) = \frac{2b}{1!}\frac{ds}{dx} + \frac{4b^2}{2!}\frac{d^2s}{dx^2} + \text{etc.} \]

Thus,

\[ \frac{b}{1!}\frac{df}{dx} + \frac{b^2}{2!}\frac{d^2f}{dx^2} + \text{etc.} = \frac{2b}{1!}\frac{ds}{dx} + \frac{4b^2}{2!}\frac{d^2s}{dx^2} + \text{etc.} \tag{20.101} \]


Assume with Euler that 


ty ae + etc. 
and substitute in (20.101), equating coefficients, to get 


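\[
\begin{aligned}
2s = \text{Const.} + f(x) & - \frac{b}{2}\frac{df}{dx} + \frac{b^3}{4!}\frac{d^3f}{dx^3} - \frac{3b^5}{6!}\frac{d^5f}{dx^5} + \frac{17b^7}{8!}\frac{d^7f}{dx^7} \\
& - \frac{155b^9}{10!}\frac{d^9f}{dx^9} + \frac{2073b^{11}}{12!}\frac{d^{11}f}{dx^{11}} - \frac{38227b^{13}}{14!}\frac{d^{13}f}{dx^{13}} + \text{etc.}
\end{aligned}
\tag{20.102}
\]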


Euler used this formula to compute the series

\[ \frac{1}{x} - \frac{1}{x+b} + \frac{1}{x+2b} - \frac{1}{x+3b} + \text{etc.} \]

He applied it to

\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \]

by taking $x = 25$, $b = 2$ and then computing $1 - \frac{1}{3} + \cdots - \frac{1}{23}$ separately.

In this way he obtained a value of π correct to eleven decimal places. The
summation formula (20.102) is often called the Boole summation formula,
though Euler had the result a hundred years before Boole. This work is in
Eu. I-14 pp. 128-130.


(7) Using the method of Exercise 6, show that

\[ f(x) - f(x+h) + f(x+2h) - \cdots \tag{20.103} \]

\[ = \frac{1}{2}f(x) + \sum_{n=1}^{\infty} \frac{(1 - 2^{2n})B_{2n}}{(2n)!}\, h^{2n-1} \frac{d^{2n-1}f}{dx^{2n-1}}. \tag{20.104} \]

Although this is not Euler’s notation, he pointed out the connection between
the terms of the series and the Bernoulli numbers. The paper appeared in
1788, though it was presented to the St. Petersburg Academy in 1776. See
Eu. I-16-1 p. 57.
(8) Show formally that when m is a positive integer

\[
\begin{aligned}
a^m - (a+b)^m + (a+2b)^m - (a+3b)^m + \cdots = {} & \frac{a^m}{2} - \frac{(2^2 - 1)B_2}{2}\binom{m}{1}a^{m-1}b - \frac{(2^4 - 1)B_4}{4}\binom{m}{3}a^{m-3}b^3 - \cdots \\
& - \frac{(2^{m+1} - 1)B_{m+1}}{m+1}\, b^m.
\end{aligned}
\]

For $a = 0$, $b = 1$ this gives

\[ 1^m - 2^m + 3^m - \cdots = \frac{(2^{m+1} - 1)B_{m+1}}{m+1}. \]

This also follows from the formula (20.104) above. Euler used this obviously
divergent series to prove the functional relation for the zeta function. See
Eu. I-15 p. 76.


20.9 Notes on the Literature 


Hardy (1949), a work on divergent series, gives an excellent treatment of the work
of Jacobi and Poisson on the Euler–Maclaurin formula. Cohen (2007) presents a modern
treatment of the Euler–Maclaurin formula and its extensions, particularly to sums involving
periodic arithmetic functions.
Operator Calculus and Algebraic Analysis


21.1 Preliminary Remarks 


The operator or operational calculus, the method of treating differential operators 
as algebraic objects, was once thought to have originated with the English physicist 
and electrical engineer Oliver Heaviside (1850-1925). Indeed, Heaviside revived and 
brilliantly applied this method to problems in mathematical physics. But the basic 
ideas can actually be traced back to Leibniz and Lagrange, who must be given credit 
as the founders of the operational method. With his notation for the differential and 
integral, Leibniz was able to regard some results on derivatives and integrals as analogs 
of algebraic results. The later insight of Lagrange was to extend this analogy to infinite 
series of differentials so that, in particular, he could write the Taylor expansion as 
an exponential function of a differential operator. In fact, this formal approach to 
infinite series appeared in the work of Newton himself. For Newton, infinite series 
in algebra served a purpose analogous to infinite decimals in arithmetic: They were 
necessary to carry out the algebraic operations to their completion. Newton’s insightful 
algorithms using formal power series were of very wide applicability in analysis, 
algebra, and algebraic geometry; their power lay precisely in their formal nature. 
Thus, the algebraic analysis of the eighteenth century can trace its origins to Newton’s 
genius. A branch of algebraic analysis focusing on the combinatorial aspects of power 
series was developed by the eighteenth-century German combinatorial school. 

In a letter of May 1695 to Johann Bernoulli,¹ Leibniz pointed out the formal
resemblance between the expression for the nth derivative of a product xy and the
binomial expansion of $(x + y)^n$. For n = 2, for example, Leibniz wrote

\[ (x + y)^2 = 1x^2 + 2xy + 1y^2, \quad d^2, xy = 1y\,ddx + 2\,dy\,dx + 1x\,ddy. \tag{21.1} \]


He made similar remarks in a September 1695 letter to l’Hôpital,² in which he used
the symbol p for the power (or exponent) so that the analogy would be even more
evident. Thus, he denoted $x^n$ by $p^n x$, so that he could write


1 Leibniz (1971) vol. 3/1, pp. 174-179, especially p. 175.
2 Leibniz (1971) vol. I, pp. 297-303, especially pp. 301-302. 
\[ p^e(x + y) = p^e x \cdot p^0 y + \frac{e}{1}\, p^{e-1}x \cdot p^1 y + \frac{e(e-1)}{1 \cdot 2}\, p^{e-2}x \cdot p^2 y + \text{etc.}, \]

\[ d^e(xy) = d^e x\, d^0 y + \frac{e}{1}\, d^{e-1}x\, d^1 y + \frac{e(e-1)}{1 \cdot 2}\, d^{e-2}x\, d^2 y + \text{etc.} \]


The exponent e could be a positive or negative integer; when e = −n was negative,
$d^{-n}$ denoted an n-fold integral. He mentioned that the e = −1 case was also noted
by Johann Bernoulli. In fact, the formula in that case would be equivalent to Taylor’s
formula. Finally in 1710, Leibniz published a paper³ on this symbolic analogy. Later,
Lagrange, inspired by this paper, extended the scope of this analogy by treating the
symbol d, denoting the differential operator, as an algebraic object. In a paper of 1772
presented to the Berlin Academy,⁴ he gave the Taylor series formula as

\[ u(x + \xi) = u + \frac{du}{dx}\xi + \frac{d^2u}{dx^2}\frac{\xi^2}{2} + \frac{d^3u}{dx^3}\frac{\xi^3}{2 \cdot 3} + \cdots = e^{\frac{du}{dx}\xi}, \tag{21.2} \]

where $\left(\frac{du}{dx}\xi\right)^n$ in the expansion of the exponential was understood to be $\frac{d^n u}{dx^n}\xi^n$. We
observe that this point of view was not foreign to Euler. In a brilliant paper of 1750,⁵
he suggested replacing the nth derivative by $z^n$ in order to solve a differential equation
of infinite order.

The generation of mathematicians after Lagrange chose, for clarity, to separate the
symbol $\frac{d}{dx}$ from the function u upon which it acted. Lacroix, for example, in his
influential work summarizing the eighteenth-century discoveries in calculus, wrote
$e^{\frac{du}{dx}\xi}$ as $e^{\xi\frac{d}{dx}}u$. It appears that the French mathematician L. F. Arbogast (1759-1803),
the collector and preserver of important mathematical works, was the first to separate
the operator from the object on which it operated. Arbogast’s method, published in
1800, so impressed the English mathematician Charles Babbage that he wrote,⁶


Arbogast, in the 6th article of his “Calcul des derivations,” where, by a peculiarly elegant mode
of separating the symbols of operation from those of quantity, and operating upon them as upon
analytical symbols; he derives not only these, but many other much more general theorems with
unparalleled conciseness.


Returning to Lagrange’s paper, we note that he observed that the difference operator 
could be expressed as 


\[ \Delta u(x) = u(x + \xi) - u(x) = e^{\frac{du}{dx}\xi} - 1. \tag{21.3} \]


In Arbogast’s notation, write $\Delta u = \left(e^{\xi\frac{d}{dx}} - 1\right)u$. Lagrange applied this formula
to obtain a formal though very simple derivation of the Euler-Maclaurin summation, 
and he extended this to situations involving sums of sums. Laplace used Lagrange’s 
operational method in his work on difference equations; he also attempted to give a 


3 Leibniz (1971) vol. 5, pp. 377-382. 
4 Lagrange (1772).

5 Eu. I-14 pp. 463-515. E 189.

6 Babbage and Herschel (1813) p. xi. 
rigorous derivation of Lagrange’s formulas. In his work of 1800, Arbogast applied this
technique to numerous problems, including the solutions of differential equations. 
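
For polynomials, Lagrange’s symbolic identity $\Delta = e^{\xi\frac{d}{dx}} - 1$ is an exact finite statement, because the exponential series terminates once the derivatives vanish. The sketch below illustrates this under stated assumptions: a polynomial is represented by its list of coefficients (lowest degree first), and the function names are illustrative rather than anything found in Lagrange or Arbogast.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Derivative of a polynomial given by coefficients, lowest degree first."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def difference_via_exponential(coeffs, x, xi):
    """Apply (e^{xi d/dx} - 1) to the polynomial and evaluate the result at x."""
    total = Fraction(0)
    current = list(coeffs)
    n = 0
    while current:  # terminates: repeated derivatives of a polynomial vanish
        n += 1
        current = derivative(current)
        total += Fraction(xi) ** n / factorial(n) * evaluate(current, x)
    return total


# u(x) = 2x^3 - 3x + 5, compared with the direct difference u(x + xi) - u(x).
u = [5, -3, 0, 2]
x, xi = Fraction(7, 3), Fraction(1, 2)
direct = evaluate(u, x + xi) - evaluate(u, x)
print(difference_via_exponential(u, x, xi) == direct)
```

For functions that are not polynomials the same expansion is only formal, which is the gap that the later attempts at a rigorous justification were meant to close.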

Then in 1808, Barnabé Brisson (1777-1828) independently applied this method 
to differential equations. A graduate of the École Polytechnique, Brisson published
his paper in its journal. In the 1810s, other French mathematicians such as
F. J. Servois (1768-1847) and J. F. Français (1775-1833) applied the methods of
Lagrange and Arbogast to obtain some results on series. Servois also considered 
the logical foundation of the operator method. However, after the 1821 publication 
of Cauchy’s Analyse algébrique, effectively establishing the limit concept at the 
foundation of analysis, the operational methods ceased to be developed in France. But 
in 1826, Cauchy presented a justification of the operational calculus using Fourier 
transforms. Interestingly, in the early twentieth century, integral transforms were 
applied to rigorize the operator methods employed by the physicist Heaviside. In 1926, 
Norbert Wiener created generalized harmonic analysis and one of his motivations was 
to provide rigorous underpinnings for the operational method. 

During the 1830s and 1840s, important work in the operational calculus was done 
in Britain. Robert Murphy (1806-1843), Duncan Gregory (1813-1844), and George 
Boole (1815-1864) applied the methods to somewhat more difficult problems than 
those considered by Français and Servois. Much of the British work was done without
full awareness of the earlier Continental work, so that even as late as 1851, William 
F. Donkin (1814-1869) published a paper in the Cambridge and Dublin Mathematical 
Journal giving an exposition of Arbogast’s method of derivations. Thus, the British 
work was not a direct continuation of the work of Arbogast, Français, and Servois; its
origins and motivations lay in a more formal and/or symbolic mathematical approach. 

To understand the historical background of the British operational calculus, note 
that Britain produced a number of outstanding mathematicians in the first half of the 
eighteenth century, including Cotes, de Moivre, Taylor, Stirling, and Maclaurin. A 
large part of their work elaborated on or continued the study of topics opened up 
by Newton. There were also some good textbook writers such as Thomas Simpson 
and Edmund Stone who explained these developments to a larger audience. In the 
second half of the century, there was a swift decline in the development of mathematics 
in Britain. Mathematics was sustained at Cambridge by the almost solitary figure of 
Edward Waring, whose main interests were algebra and combinatorics, but he had few 
followers or students and little influence. Also, John Landen did interesting work in 
analysis, making a significant contribution to elliptic integrals. 

British mathematicians had long paid scant attention to the major mathematical 
advances in continental Europe: the calculus of several variables and its applica- 
tions to problems of mathematical physics developed by Euler, Fontaine, Clairaut, 
d’Alembert, Lagrange, and Laplace; major works in algebra produced by Euler, 
Lagrange, Vandermonde, and Ruffini; and also the brilliant progress in number 
theory made by Euler and Lagrange. In the early nineteenth century, Robert Wood- 
house (1773-1827) appears to be one of the first British mathematicians to attempt to 
expand the focus of mathematics at Cambridge. He leaned strongly toward a formal 
or symbolic approach and his main interests lay in the foundations of calculus and the 
appropriate notation for its development. He also wrote expository works in subjects 
such as the calculus of variations and gravitation, and his efforts brought continental
work on these topics to the notice of the British. In 1803, Woodhouse wrote The
Principles of Analytical Calculation, a polemical work on the foundation of calculus.
He reviewed the foundational ideas of his predecessors: Newton, Leibniz, d’Alembert,
Landen, and Lagrange. He rejected the limits of Newton and d’Alembert as well as
the infinitesimals of Leibniz as inconsistent and inadequate, advocating instead the
algebraic approach of Lagrange and Arbogast, though he disputed specific details. In
the preface of his book, he wrote,⁷


I regard the rule for the multiplication of algebraic symbols, by which addition is compendiously 
exhibited, as the true and original basis of that calculus, which is equivalent to the fluxionary 
or differential calculus; on the direct operations of multiplication, are founded the reverse 
operations of division and extraction of roots,... they are still farther comprehended under a 
general formula, called the expansion, or development of a function: from the second term of 
this expansion, the fluxion or differential of a quantity may immediately be deduced, and in a 
particular application, it appears to represent the velocity of a body in a motion. 


Concerning the equal sign, =, Woodhouse maintained that in the context of series,
this sign did not denote numerical equality but the result of an operation. So if $\frac{1}{1+x}$
denoted the series obtained by dividing 1 by 1 + x, then

\[ \frac{1}{1+x} = 1 - x + x^2 - x^3 + \cdots. \]

On the other hand, if $\frac{1}{x+1}$ represented the series obtained when 1 was divided by x + 1,
then

\[ \frac{1}{x+1} = \frac{1}{x} - \frac{1}{x^2} + \frac{1}{x^3} - \frac{1}{x^4} + \cdots. \]

Woodhouse remarked with reference to the two series that the equality $\frac{1}{1+x} = \frac{1}{x+1}$
could not be affirmed.

Woodhouse wrote other articles and books advocating his formal point of view. 
In his 1809 textbook, A Treatise on Plane and Spherical Trigonometry, he defined 
the trigonometric functions by their series expansions and made arguments for 
the advantages of the analytic approach over the geometric approach of Newton’s 
Principia. Though his 1809 treatise acquired some popularity and went into several 
editions, Woodhouse was unable to convert the Cambridge dons. Progress in intro- 
ducing the analytic approach into the curriculum was achieved mainly through the 
efforts of his students: Edward Ffrench Bromhead (1789-1855), Charles Babbage 
(1791-1871), George Peacock (1791-1858), and John Herschel (1792-1871). As 
students at Cambridge, they formed the Analytical Society in 1812 to promote and 
practice analytical mathematics; they decided to publish a journal called the Memoirs 
of the Analytical Society, though Babbage had wished to name it The Principles of 
Pure D-ism in opposition to the Dot-age of the university. Only one volume was 


7 Woodhouse (1803) p. II. 
published; it appeared in 1813 and contained one article by Babbage on functional
equations and two by Herschel, on trigonometric series and on finite difference 
equations. As the members scattered, the Analytical Society ceased to meet, though 
many of its members became fellows or professors at Cambridge. In any case, 
Babbage, Peacock, and Herschel remained friends. They translated an elementary text 
by Lacroix on differential and integral calculus and in 1820 published a supplementary 
collection of examples on calculus, difference equations, and functional equations. 
Their efforts gradually influenced the teaching of mathematics at Cambridge, leading 
to the acceptance of Continental methods. 

Even though Babbage succeeded Woodhouse as Lucasian Professor of Mathe- 
matics in 1828, he spent very little time at Cambridge. Consistent with his formal 
approach, he became interested in the mechanization of computation and spent the 
rest of his life on the problems associated with that. He first developed plans to 
construct a “difference engine,” complete with printing device; he hoped it would
eventually compute up to twenty decimal places using sixth-order differences. A
Swedish engineer, Georg Scheutz, used Babbage’s description to build a machine with
a printer capable of computing eight decimal places using fourth-order differences. 
Instead of actually building a machine, Babbage himself went on to design a more 
elaborate computer, called the “analytical engine,” inspired by the study of Jacquard’s 
punched cards weaving machine. 

John Herschel lost interest in pure mathematics and became a professor of 
astronomy at Cambridge. So that left George Peacock to carry out the reform or 
modernization of the teaching of mathematics at Cambridge and more generally in 
England. In 1830, he first published an algebra textbook, later published in two
volumes,⁸ in which he attempted to put the theory of negative and complex numbers on
a firm foundation by dividing algebra into two parts, arithmetical and symbolical. The 
symbols of arithmetical algebra represented positive numbers, whereas the domain of 
the symbols in symbolical algebra was extended by the principle of the permanence 
of equivalent forms. This abstract principle implied, according to Peacock, that any 
formula in symbolical algebra would yield a formula in arithmetical algebra if the 
variables were properly chosen. Note that this approach excluded the possibility of a 
noncommutative algebra. 

Ironically, this algebraic approach to calculus taken by the British mathematicians 
of the 1820s and 1830s stood in contrast with the rigorous methods contempo- 
raneously introduced in Europe by Gauss, Cauchy, Abel, and Dirichlet. The next 
generation of British mathematicians, including Duncan Gregory, Robert Murphy, 
George Boole, Leslie Ellis (1817-1858) and others, were aware of the continental 
approach and yet they felt that their own methods had legitimacy. Early death 
prevented the talented Duncan Gregory from preparing a new foundation for this 
method. However, the symbolic method, even if lacking in rigor, had significant 
influence. The origin of some aspects of modern operational calculus and of the 
theory of distributions can be seen in the symbolic methods of Gregory and Boole. 


8 Peacock (1843-1845). 
Moreover, some of the methods themselves were put on a more solid foundation
through G. C. Rota’s twentieth-century work on umbral calculus.⁹

The British symbolic approach served as the starting point for some significant 
developments: the symbolic logic of Boole and Augustus De Morgan (1806-1871) and 
the invariant theory of Boole, Cayley, and Sylvester. Consider, for example, Boole’s 
remarks in the introduction to his 1847 work, The Mathematical Analysis of Logic:¹⁰


They who are acquainted with the present state of the theory of Symbolical Algebra, are aware, 
that the validity of the processes of analysis does not depend upon the interpretation of the 
symbols which are employed, but solely upon the laws of their combination. Every system of 
interpretation which does not affect the truth of the relations supposed, is equally admissible. 


G. H. Hardy wrote in his book Divergent Series that the British symbolical
mathematicians had the spirit but not the accuracy of the twentieth-century algebraists.¹¹
Nevertheless, there is at least one example of abstract algebraic work consistent with
the standards of today: Hamilton’s theory of couples and of quaternions. The former
laid a rigorous algebraic basis for complex numbers. And Hamilton reported that his
1843 discovery of quaternions was guided by a determination for consistency, so that
he left open the possibility of an algebra with zero divisors or with noncommutativity.
It is noteworthy that around 1819, Gauss composed a multiplication table for
quaternions,¹² though apparently he did not develop this further.

Papers by Murphy, Duncan Gregory, and Boole published between 1835 and 1845 
provided important steps toward the creation of concepts laying the groundwork 
for the eventual construction of abstract algebraic theories. Murphy, the son of a 
shoemaker-parish clerk in Cork County, Ireland, studied mathematics on his own, 
and his talent soon became known. In 1819, Mr. Mulcahy, a teacher in Cork County, 
published mathematical problems in the local newspaper; he soon began to receive 
original solutions from an anonymous reader. He was surprised to discover that his 
correspondent was a boy of 13. After this, Murphy began to receive encouragement 
and financial assistance to continue his studies. In 1825, some of his work was brought 
to the attention of Woodhouse; consequently, Murphy was admitted to Gonville and 
Caius College, Cambridge, from which he graduated in 1828. In an 1835 paper on 
definite integrals, Murphy introduced the idea of orthogonal functions, giving them 
the name reciprocal functions. In his 1837 paper,!> “First Memoir on the Theory of 
Analytical Operations,” he defined what he called linear operations and showed that 
their sums and products, obtained by composition, were also linear operations, though 
the products were not necessarily commutative. He stated a binomial theorem for 
noncommutative operations, and went on to consider inverses of operations, proving 
that the inverse of the product of two operations A and B was $B^{-1}A^{-1}$. Murphy
also defined the kernel of an operation, naming it the appendage of the operation. He 
applied his theory mainly to three operations: the differential operator, the difference 


9 See, for example, Mullin and Rota (1970). 
10 Boole (1847) p. 3. 

11 Hardy (1949) p. 18.

12 Gauss (1863-1927) vol. 8, pp. 357-361. 
13 Murphy (1837). 
operator $\Delta$, and the operator transforming a function $f(x)$ to $f(x + h)$. Thus, in his
paper Murphy isolated and defined some basic ideas of a system of abstract algebra. 

Around this time, Duncan F. Gregory, descendant of the great James Gregory, also
began to develop his mathematical ideas. Gregory was born at Edinburgh, Scotland, 
and graduated from Trinity College, Cambridge, in 1837. Even as a student, Gregory 
was interested in mathematical research and in encouraging British mathematicians 
to take up this activity. As a step in this direction, in 1837 he helped found the 
Cambridge Mathematical Journal, of which he was the editor until a few months 
before his premature death in February 1844. R. Leslie Ellis, who served as editor 
after this, wrote that Gregory was particularly well qualified for this position for “his 
acquaintance with mathematical literature was very extensive, while his interest in all 
subjects connected with it was not only very strong, but also singularly free from the 
least tinge of jealous or personal feeling. That which another had done or was about 
to do, seemed to give him as much pleasure as if he himself had been the author 
of it, and this even when it related to some subject which his own researches might 
seem to have appropriated.”¹⁴ In addition, D. F. Gregory encouraged undergraduates
to publish and permitted authors to publish anonymously so that they need not fear for 
their reputations. This journal was later renamed Cambridge and Dublin Mathematical 
Journal; it then evolved into the Quarterly Journal of Pure and Applied Mathematics, 
with editors including William Thomson and J. W. L. Glaisher. In fact, most British 
mathematicians of the period contributed papers to the CMJ, including Augustus De 
Morgan, J. J. Sylvester, George Gabriel Stokes, Arthur Cayley, George Boole, and 
William Thomson. 

By about 1845, research in operational calculus was no longer widely pursued. 
In the 1890s, however, Heaviside revived operational methods in order to solve 
differential equations occurring in electrical engineering problems. Heaviside may or 
may not have independently devised these methods, but he made at least one important 
new contribution: the use of the step function $H(t) = 0$ with $t$ negative, and $H(t) = 1$
with $t$ nonnegative. By taking the derivative of this function, he obtained the Dirac
delta function. In some situations, Heaviside also used the derivative of the delta 
function. Because these methods were so successful in solving problems in electrical 
engineering, mathematicians such as Wiener, Carson, Doetsch, and van der Pol made 
substantial progress toward putting them on a rigorous footing. 

Toward the end of the eighteenth century, algebraic analysis was taken in a 
different direction by the German combinatorial school founded by C. F. Hindenburg 
(1741-1808), Professor at Leipzig. Since combinatorial considerations were important 
in probability computations as well as in deriving formulas for higher derivatives 
of products of functions and of compositions of functions, Hindenburg saw that 
he could find relations among series through the use of combinatorial
concepts. This school took as its starting point and inspiration Euler’s extensive use 
of series to tackle various mathematical problems, as set forth in his Introductio 
in Analysin Infinitorum of 1748. The combinatorial school played a significant role 


14 Ellis (1845).
in the overall development of mathematics in Germany;¹⁵ it included among its
early members Christian Kramp (1760-1826), Gauss’s thesis supervisor J. F. Pfaff 
(1765-1825), and H. A. Rothe (1773-1842). Later, H. F. Scherk (1798-1885), Franz 
Ferdinand Schweins (1780-1856), August Leopold Crelle (1780-1855), Weierstrass’s 
teacher Christoph Gudermann (1798-1852), and Moritz A. Stern (1807-1894) made 
contributions to this tradition, and many of them were active in instituting educational 
reforms in Germany. In fact, it is very likely that Weierstrass chose to make power 
series the fundamental object in his study of analysis because of his early contact with 
Gudermann. Also, Riemann’s earliest research on fractional derivatives and infinite 
series was done while he was a student under Stern, though Riemann eventually took a 
completely different route as a result of his later association with Dirichlet and Gauss. 
The combinatorial school produced some interesting results useful even today, and 
their approach to infinite series is not without significance in modern mathematical 
research. 

Hindenburg believed, and his colleagues agreed, that his most important work was 
the polynomial formula he proved in 1779. A power series raised to an exponent is 
another power series: 


$$(1 + a_1x + a_2x^2 + a_3x^3 + \cdots)^m = 1 + A_1x + A_2x^2 + A_3x^3 + \cdots.$$


Hindenburg’s formula expressed $A_n$ in terms of $a_1, a_2, \ldots, a_n$. De Moivre had already
done this¹⁶ with m a positive integer in a paper of 1697. Leibniz and Johann Bernoulli
also considered this case in letters exchanged in 1695. This particular case is quite 
useful; for example, it can be applied to give a short proof of Faà di Bruno’s formula,
giving the nth derivative of a composition of two functions. Faà di Bruno stated this in
the mid-nineteenth century without referring to the earlier proofs;¹⁷ Arbogast offered
a proof in 1800.¹⁸

Using Newton’s binomial theorem, Hindenburg extended his formula to fractional 
and negative m by expanding $(1 + y)^m$, where $y = a_1x + a_2x^2 + a_3x^3 + \cdots$,
and $y^n$ was obtained from the polynomial theorem for positive integral n. Part of
Hindenburg’s achievement was to clarify the combinatorial content of the formula.
De Moivre had given only the recursive rule for the calculation of $A_{n+1}$ from $A_n$. In the
notation presented by B. F. Thibaut in his 1809 textbook Grundriss der Allgemeinen
Arithmetik,¹⁹ Hindenburg’s formula would be expressed as


$$A_n = \sum_{h=1}^{n} \binom{m}{h}\, p\,{}^{n}\overset{h}{C}.$$


The symbol ${}^{n}\overset{h}{C}$ represented the sum of all products of h factors taken from
$a_1, a_2, \ldots, a_n$, so that the sum of the indices in each summand was n. The symbol p


15 See Jahnke (1993).

16 de Moivre (1697). 

17 Faà di Bruno (1857).

18 Arbogast (1800) pp. 30-31, 310-313. 
19 Jahnke (1993). 
stood for the coefficient associated with each summand, each summand consisting of h
factors, and this coefficient gave the number of different permutations of the h factors.
Thus, ${}^{6}\overset{3}{C}$ stood for $a_1^2a_4 + a_1a_2a_3 + a_2^3$ and the number of terms in the sum was the
number of partitions of 6 with exactly three parts. Therefore, $p\,{}^{6}\overset{3}{C}$ represented

$$\frac{3!}{2!\,1!}\,a_1^2a_4 + \frac{3!}{(1!)^3}\,a_1a_2a_3 + a_2^3.$$


The combinatorial school set great importance on this formula, overestimating its 
potential. Still, Hindenburg’s formula is useful in power series manipulation. 
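
The combinatorial content of Hindenburg’s formula can be checked mechanically. The following Python sketch is an illustration only, not a reconstruction of Hindenburg’s own procedure; the function names and the sample coefficients in `a` are assumptions introduced here. For a positive integer m, it sums over the partitions of n into h parts the product of the corresponding $a_i$, weighted by the number of distinct permutations of the parts, and the result can be compared with the coefficient found by multiplying the series out directly.

```python
from collections import Counter
from fractions import Fraction
from math import comb, factorial


def partitions(n, h, max_part=None):
    """Yield the partitions of n into exactly h positive parts (non-increasing)."""
    if max_part is None:
        max_part = n
    if h == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n - h + 1, max_part), 0, -1):
        for rest in partitions(n - first, h - 1, first):
            yield (first,) + rest


def hindenburg_coefficient(a, m, n):
    """A_n = sum_h C(m, h) * sum over partitions of n into h parts of
    (number of permutations of the parts) * (product of the a_i).
    `a` is a hypothetical dict mapping i -> a_i for 1 <= i <= n."""
    total = Fraction(0)
    for h in range(1, n + 1):
        inner = Fraction(0)
        for parts in partitions(n, h):
            perms = factorial(h)
            for multiplicity in Counter(parts).values():
                perms //= factorial(multiplicity)
            product = Fraction(1)
            for i in parts:
                product *= a[i]
            inner += perms * product
        total += comb(m, h) * inner
    return total


def power_series_coefficient(a, m, n):
    """Coefficient of x^n in (1 + a_1 x + a_2 x^2 + ...)^m by direct multiplication."""
    base = [Fraction(1)] + [Fraction(a[i]) for i in range(1, n + 1)]
    result = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(m):
        result = [sum(result[j] * base[k - j] for j in range(k + 1))
                  for k in range(n + 1)]
    return result[n]
```

For instance, `list(partitions(6, 3))` lists the three partitions 4+1+1, 3+2+1, and 2+2+2 that appear in the example above.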

In 1793, H. A. Rothe used Hindenburg’s formula to state the reversion of series 
formula as a combinatorial relation. Two years later, Rothe and Pfaff showed the 
equivalence of Rothe’s formula with the Lagrange inversion formula. In modern times, 
Lagrange’s formula has been regarded as more combinatorial than analytic in nature; 
in this respect, the combinatorialists were on the right track. Rothe also found one 
important terminating version of the q-binomial theorem, published in the preface of
his 1811 book.²⁰ In this formula, the coefficients of the powers of x are q-extensions
of the binomial coefficients, now called Gaussian polynomials. It is possible that 
Rothe may have discovered these polynomials even before Gauss’s work of 1805, 
published in 1811. It would be nice to know Rothe’s combinatorial interpretation of 
these polynomials; he gave no proof or comment. In order to get an insight into the 
combinatorialists’ mathematical style, consider the comment given by Thomas Muir in his
monumental The Theory of Determinants in the Historical Order of Development:²¹


Rothe was a follower of Hindenburg, knew Hindenburg’s preface to Rüdiger’s Specimen
Analyticum, and was familiar with what had been done by Cramer and Bézout. ... His memoir is
very explicit and formal, proposition following definition, and corollary following proposition, in 
the most methodical manner. 


Christian Kramp taught mathematics, chemistry, and experimental physics at the École
Centrale in Cologne and in 1809 he became professor of mathematics and dean of 
the faculty of science at Strasbourg. He was a follower of Hindenburg and contributed 
articles to various journals edited by Hindenburg. In a paper of 1796,²² he derived
some interesting properties of Stirling numbers. One object of interest for him was 
what he termed a factorial: 


$$a^{n|d} = a(a+d)(a+2d)\cdots(a+(n-1)d).$$


He expanded this as a polynomial in a and d, and obtained formulas for the coefficients 
involving Stirling numbers. Denoting the Stirling numbers of the first and second kinds 
by s(n,k) and S(n,k), Kramp proved that²³


20 Rothe (1811). 

21 Muir (1960) vol. I, p. 55. 
22 Kramp (1796). 

23 See Knuth (1992). 
$$|s(n+1,\,n+1-k)| = \sum \binom{n+1}{k+l} \frac{(k+l)!}{j_1!\,2^{j_1}\, j_2!\,3^{j_2}\, j_3!\,4^{j_3} \cdots},$$

$$S(n+k,\,n) = \sum \binom{n+k}{k+l} \frac{(k+l)!}{j_1!\,(2!)^{j_1}\, j_2!\,(3!)^{j_2}\, j_3!\,(4!)^{j_3} \cdots},$$


where the sums were over all nonnegative $j_1, j_2, \ldots, j_k$ such that $j_1 + 2j_2 + 3j_3 + \cdots + kj_k = k$,
and where $l = j_1 + j_2 + j_3 + \cdots + j_k$. Kramp also introduced the factorial notation, n!.
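
Formulas of this kind are easy to test numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the sums are taken in the form $\sum \binom{n+1}{k+l}(k+l)!/\prod_i j_i!\,(i+1)^{j_i}$ for the first kind and $\sum \binom{n+k}{k+l}(k+l)!/\prod_i j_i!\,((i+1)!)^{j_i}$ for the second kind, with $j_1 + 2j_2 + \cdots + kj_k = k$ and $l = j_1 + \cdots + j_k$. Each sum is compared with Stirling numbers generated from their standard recurrences.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling_first_unsigned(n, k):
    """|s(n, k)| from c(n, k) = c(n-1, k-1) + (n-1) c(n-1, k)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return stirling_first_unsigned(n - 1, k - 1) + (n - 1) * stirling_first_unsigned(n - 1, k)


@lru_cache(maxsize=None)
def stirling_second(n, k):
    """S(n, k) from S(n, k) = S(n-1, k-1) + k S(n-1, k)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return stirling_second(n - 1, k - 1) + k * stirling_second(n - 1, k)


def multiplicity_vectors(k):
    """Yield (j_1, ..., j_k), nonnegative, with j_1 + 2 j_2 + ... + k j_k = k."""
    def rec(i, remaining):
        if i > k:
            if remaining == 0:
                yield ()
            return
        for j in range(remaining // i + 1):
            for tail in rec(i + 1, remaining - i * j):
                yield (j,) + tail
    yield from rec(1, k)


def kramp_first(n, k):
    """Sum for |s(n+1, n+1-k)|; j_i counts cycles of length i+1."""
    total = 0
    for js in multiplicity_vectors(k):
        l = sum(js)
        denom = 1
        for i, j in enumerate(js, start=1):
            denom *= factorial(j) * (i + 1) ** j
        total += comb(n + 1, k + l) * (factorial(k + l) // denom)
    return total


def kramp_second(n, k):
    """Sum for S(n+k, n); j_i counts blocks of size i+1."""
    total = 0
    for js in multiplicity_vectors(k):
        l = sum(js)
        denom = 1
        for i, j in enumerate(js, start=1):
            denom *= factorial(j) * factorial(i + 1) ** j
        total += comb(n + k, k + l) * (factorial(k + l) // denom)
    return total
```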


21.2 Euler’s Solution of a Difference Equation 


In a very interesting paper of 1750,²⁴ “De serierum determinatione seu nova methodus
inveniendi terminos generales serierum,” Euler used symbolic calculus to solve the 
difference equation 


$$y(x) - y(x - 1) = X(x). \tag{21.4}$$


In his solution, Euler applied to differential equations of infinite order the method 
applicable to differential equations of finite order; thus, his work is difficult to confirm. 
However, it is interesting to compare Euler’s results with the results obtained by 
conventional methods. By applying the Taylor series to expand y(x — 1), he had 


$$y(x-1) = y(x) - y'(x) + \frac{1}{2!}\,y''(x) - \cdots,$$
and he could write equation (21.4) in the form

$$\frac{dy}{dx} - \frac{1}{2!}\frac{d^2y}{dx^2} + \frac{1}{3!}\frac{d^3y}{dx^3} - \cdots = X. \tag{21.5}$$


Before working with equation (21.4), Euler took the case X = 0 and obtained
the solution of (21.5) as a series $\sum a_n \sin nx$. We can omit discussion of this portion
of the paper, as the solution for this case is contained within the solution in general
of equation (21.5). Euler viewed (21.5) as a nonhomogeneous equation of infinite
order with constant coefficients and, in a highly unorthodox way, he applied the same
technique that he had earlier applied to nonhomogeneous equations of finite
order with constant coefficients. His method of solving such differential equations is
discussed in our Section 14.6. We restate that result: Suppose $\alpha_1, \alpha_2, \ldots, \alpha_n$ are the
roots of the nth-degree polynomial P(x). Then with $D = \frac{d}{dx}$, the equation


P(D)y=X 


has the solution

$$y = \frac{e^{\alpha_1 x}}{P'(\alpha_1)} \int e^{-\alpha_1 x} X\,dx + \cdots + \frac{e^{\alpha_n x}}{P'(\alpha_n)} \int e^{-\alpha_n x} X\,dx. \tag{21.6}$$


24 Eu. I-14 pp. 463-515. E 189.
Euler observed that if $\frac{d^n y}{dx^n}$ on the left-hand side of (21.5) were replaced by $z^n$,
n = 1, 2, 3, ..., then the result would be

$$z - \frac{z^2}{2!} + \frac{z^3}{3!} - \cdots = 1 - e^{-z}.$$

The roots of $P(z) = 1 - e^{-z}$ were given by $2\pi i n$, where n was an integer.
Thus, $P'(2\pi i n) = 1$; the solution of (21.5), an equation of infinite order, would be
given by

$$y = \int X\,dx + \sum_{k=1}^{\infty} \left( e^{2\pi i k x} \int e^{-2\pi i k x} X\,dx + e^{-2\pi i k x} \int e^{2\pi i k x} X\,dx \right). \tag{21.7}$$


Euler combined the two terms in parentheses within the summation in (21.7) to 
obtain 


$$y = \int X\,dx + 2\sum_{k=1}^{\infty} \left( \cos(2\pi kx) \int \cos(2\pi kx)\,X\,dx + \sin(2\pi kx) \int \sin(2\pi kx)\,X\,dx \right); \tag{21.8}$$

here note that

$$e^{\pm 2\pi i k x} = \cos(2\pi kx) \pm i \sin(2\pi kx).$$


Observe that because its integrals are indefinite, (21.8) is not a Fourier series, 
though it may resemble one. Euler applied (21.8) to the difference equation 


$$y(x) - y(x-1) = \ln x, \qquad y(0) = 0. \tag{21.9}$$


Note that a solution of this equation is $y(x) = \ln\Gamma(x + 1)$. Substituting $X = \ln x$
in (21.8) gave Euler


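$$y = \int \ln x\,dx + 2\sum_{k=1}^{\infty} \left( \cos(2\pi kx) \int \cos(2\pi kx) \ln x\,dx + \sin(2\pi kx) \int \sin(2\pi kx) \ln x\,dx \right). \tag{21.10}$$
Euler calculated the integrals

$$\int \ln x\,dx = x\ln x - x, \tag{21.11}$$

$$\int \ln x \cos mx\,dx = \frac{1}{m}\sin mx \left( \ln x + \frac{1!}{m^2x^2} - \frac{3!}{m^4x^4} + \frac{5!}{m^6x^6} - \cdots \right) + \frac{1}{m}\cos mx \left( \frac{1}{mx} - \frac{2!}{m^3x^3} + \frac{4!}{m^5x^5} - \cdots \right), \tag{21.12}$$

$$\int \ln x \sin mx\,dx = -\frac{1}{m}\cos mx \left( \ln x + \frac{1!}{m^2x^2} - \frac{3!}{m^4x^4} + \frac{5!}{m^6x^6} - \cdots \right) + \frac{1}{m}\sin mx \left( \frac{1}{mx} - \frac{2!}{m^3x^3} + \frac{4!}{m^5x^5} - \frac{6!}{m^7x^7} + \cdots \right). \tag{21.13}$$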


He did not explain his method in obtaining (21.12) and (21.13), but certainly one 
way of deriving them is to use integration by parts. For example, 


$$\int \ln x \cos mx\,dx = \frac{\ln x}{m}\sin mx - \frac{1}{m}\int \sin mx\,\frac{dx}{x}$$

$$= \frac{\ln x}{m}\sin mx + \frac{\cos mx}{m^2 x} + \frac{1}{m^2}\int \cos mx\,\frac{dx}{x^2},$$


and so on. Euler noted that (21.12) and (21.13) implied that 


$$2\cos mx \int \ln x \cos mx\,dx + 2\sin mx \int \ln x \sin mx\,dx = 2\left( \frac{1}{m^2x} - \frac{2!}{m^4x^3} + \frac{4!}{m^6x^5} - \frac{6!}{m^8x^7} + \text{etc.} \right) + a_m \cos mx + b_m \sin mx, \tag{21.14}$$


where $a_m$ and $b_m$ were constants of integration. He next substituted (21.11) and
(21.14) in (21.10) to obtain, with P a constant of integration,


$$y = P + x\ln x - x + \sum_{k=1}^{\infty} \left( a_k \cos(2k\pi x) + b_k \sin(2k\pi x) \right) + \sum_{k=1}^{\infty} \frac{B_{2k}}{2k(2k-1)x^{2k-1}}, \tag{21.15}$$


where he applied (16.31) to obtain the infinite series with Bernoulli numbers $B_{2k}$.
Since y(1) = 0, he set x = 1 in (21.15) to obtain


O=P “14 Yate C= ae 
1 


Euler noted that he had found $C = \frac{1}{2}\ln 2\pi$ in his differential calculus book. For
Euler’s derivation of C, see (20.19) and the subsequent discussion. 
Euler was thus able to rewrite (21.15) as 


$$y = x\ln x - x + \frac{1}{2}\ln 2\pi + \sum_{k=1}^{\infty} \left( a_k \cos(2k\pi x) + b_k \sin(2k\pi x) \right) + \sum_{k=1}^{\infty} \frac{B_{2k}}{2k(2k-1)x^{2k-1}}. \tag{21.16}$$
Note that (21.16) is the equation for $\ln\Gamma(x + 1)$; it has an indeterminacy in that
$a_k$ and $b_k$ are not known. Of course, this is understandable because $\ln\Gamma(x)$ is not
completely determined by the two conditions given in (21.9):


$$\ln\Gamma(x + 1) = \ln x + \ln\Gamma(x), \qquad \ln\Gamma(1) = 0.$$


Recall that the Bohr–Mollerup–Artin theorem required a third condition: that
$\ln\Gamma(x)$ be a convex function. Note also that (21.16) implies that


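$$\Gamma(x + 1) = \sqrt{2\pi}\, x^x e^{-x}\, e^{\sum_{k=1}^{\infty} (a_k \cos(2k\pi x) + b_k \sin(2k\pi x))} \cdot e^{\frac{1}{12x} - \frac{1}{360x^3} + \cdots}, \tag{21.17}$$
and since
$$\Gamma(x + 1) \sim \sqrt{2\pi}\, x^{x+\frac{1}{2}} e^{-x} \quad \text{as } x \to \infty,$$


by (21.17) we have


$$e^{\sum_{k=1}^{\infty} (a_k \cos(2k\pi x) + b_k \sin(2k\pi x))} \sim \sqrt{x} \quad \text{as } x \to \infty. \tag{21.18}$$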
Thus, the convexity of $\ln\Gamma(x)$ has somehow allowed us to obtain (21.18).
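
The size of the term that Euler’s series leaves undetermined can be seen numerically. The following Python sketch is an illustration under assumptions made here: the function names are hypothetical, the Bernoulli numbers are taken in the modern convention ($B_2 = 1/6$, $B_4 = -1/30$), and `math.lgamma` stands in for $\ln\Gamma$. It subtracts the non-periodic part of the series, $x\ln x - x + \frac{1}{2}\ln 2\pi + \sum B_{2k}/(2k(2k-1)x^{2k-1})$, from $\ln\Gamma(x+1)$; the remainder is close to $\frac{1}{2}\ln x$, in line with (21.18).

```python
from fractions import Fraction
from math import comb, lgamma, log, pi


def bernoulli_numbers(n_max):
    """B_0, ..., B_{n_max} with B_1 = -1/2, from sum_{j<=m} C(m+1, j) B_j = 0."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B


def euler_series(x, terms=4):
    """x ln x - x + (1/2) ln(2 pi) + sum_{k=1}^{terms} B_2k / (2k (2k-1) x^(2k-1))."""
    B = bernoulli_numbers(2 * terms)
    tail = sum(float(B[2 * k]) / (2 * k * (2 * k - 1) * x ** (2 * k - 1))
               for k in range(1, terms + 1))
    return x * log(x) - x + 0.5 * log(2 * pi) + tail


def periodic_gap(x, terms=4):
    """What remains of ln Gamma(x + 1) after removing the non-periodic series."""
    return lgamma(x + 1) - euler_series(x, terms)
```

Comparing `periodic_gap(x)` with `0.5 * log(x)` for moderately large x shows the two agreeing to many digits.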
We add the observation that in 1923, the Danish mathematician Niels Erik Nörlund
gave the solution to the equation y(x + 1) − y(x) = X(x) in an alternate form.²⁵ His
result may be stated: 


yx) = lim (/ X(x)e™ dx — + X(x +5) a) , (21.19) 


s=0 


providing the limit exists. Sufficient conditions for its existence are that X (x), for 
all x > O, has a continuous derivative of order m that tends to 0 as x — oo; that 
the integral ie By (—t + [—t]) X™ (x +1) dt converges uniformly over x € (0,1), 
where B,,(t) represents the mth Bernoulli polynomial. 


21.3 Lagrange’s Extension of the Euler—Maclaurin Formula 


In his 1772 “Sur une nouvelle espèce de calcul,” Lagrange set out to create a new
symbolic method in calculus.²⁶ As a first step, he expressed the Taylor series of a
function u(x, y, z,...) of several variables as


$$u(x + \xi,\, y + \psi,\, z + \zeta, \ldots) = e^{\frac{du}{dx}\xi + \frac{du}{dy}\psi + \frac{du}{dz}\zeta + \cdots}.$$


25 Nörlund (1923). Also see Nörlund (1924).
26 Lagrange (1772). 




In his symbolic notation, Lagrange understood the numerator of the nth term of
the exponential series to represent $\left( \xi\frac{d}{dx} + \psi\frac{d}{dy} + \cdots \right)^n u$ rather than $\left( \xi\frac{du}{dx} + \psi\frac{du}{dy} + \cdots \right)^n$. In
effect, Lagrange was treating the derivative operator as an algebraic quantity. The
later notation of Arbogast makes this approach clearer,²⁷ allowing us to write $e^{\xi \frac{d}{dx}} u$
for Lagrange’s $e^{\frac{du}{dx}\xi}$ and $(e^{\xi \frac{d}{dx}} - 1)^{\lambda} u$ for his $(e^{\frac{du}{dx}\xi} - 1)^{\lambda}$. It is easy to see that the
last expression is the symbolic form of the λth difference $\Delta^{\lambda} u$, since we may write


$$\Delta u = u(x + \xi) - u(x) = \left( e^{\xi \frac{d}{dx}} - 1 \right) u. \tag{21.20}$$


It follows that the difference operator Δ can be identified with the operator $e^{\xi \frac{d}{dx}} - 1$
and the repeated application of these operations yields


$$\Delta^{\lambda} u = \left( e^{\xi \frac{d}{dx}} - 1 \right)^{\lambda} u. \tag{21.21}$$


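Following Leibniz, Lagrange noted that, given the derivative operator d, he could write

$$d^{-1} = \int, \quad d^{-2} = \int^{2}, \ldots$$

and

$$\Delta^{-1} = \Sigma, \quad \Delta^{-2} = \Sigma^{2}, \ldots.$$

Here $\int^{\lambda}$ stood for an iterated integral and $\Sigma^{\lambda}$ for an iterated sum. Lagrange applied his
symbolic method to a generalization of the Euler–Maclaurin summation by expanding
the expression in (21.21). He assumed the series expansion


$$(e^{\omega} - 1)^{\lambda} = \omega^{\lambda} \left( 1 + A\omega + B\omega^2 + C\omega^3 + D\omega^4 + \cdots \right)$$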


and took the logarithm to obtain 


$$\lambda \ln(e^{\omega} - 1) - \lambda \ln \omega = \ln(1 + A\omega + B\omega^2 + C\omega^3 + D\omega^4 + \cdots).$$


By differentiation, he found 


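$$\frac{\lambda e^{\omega}}{e^{\omega} - 1} - \frac{\lambda}{\omega} = \frac{A + 2B\omega + 3C\omega^2 + 4D\omega^3 + \cdots}{1 + A\omega + B\omega^2 + C\omega^3 + D\omega^4 + \cdots}. \tag{21.22}$$
Since
$$\frac{e^{\omega}}{e^{\omega} - 1} = \frac{1}{1 - e^{-\omega}} = \frac{1}{\omega \left( 1 - \frac{\omega}{2} + \frac{\omega^2}{2\cdot 3} - \frac{\omega^3}{2\cdot 3\cdot 4} + \cdots \right)},$$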

he obtained the equation 


27 Arbogast (1800) p. 350. 
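$$\lambda \left( \frac{1}{2} - \frac{\omega}{2\cdot 3} + \frac{\omega^2}{2\cdot 3\cdot 4} - \cdots \right) \left( 1 + A\omega + B\omega^2 + C\omega^3 + \cdots \right)$$

$$= \left( 1 - \frac{\omega}{2} + \frac{\omega^2}{2\cdot 3} - \cdots \right) \left( A + 2B\omega + 3C\omega^2 + \cdots \right). \tag{21.23}$$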


Finally, by equating the coefficients of the powers of w, Lagrange found that 


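$$A = \frac{\lambda}{2}, \quad 2B = \frac{(\lambda+1)A}{2} - \frac{\lambda}{2\cdot 3}, \quad 3C = \frac{(\lambda+2)B}{2} - \frac{(\lambda+1)A}{2\cdot 3} + \frac{\lambda}{2\cdot 3\cdot 4},$$

$$4D = \frac{(\lambda+3)C}{2} - \frac{(\lambda+2)B}{2\cdot 3} + \frac{(\lambda+1)A}{2\cdot 3\cdot 4} - \frac{\lambda}{2\cdot 3\cdot 4\cdot 5}, \quad \text{etc.} \tag{21.24}$$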
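
The recursion obtained by equating coefficients lends itself to a short computation. The Python sketch below is illustrative: the function name is hypothetical, and the general step $k\,c_k = \sum_{j=1}^{k} (-1)^{j-1}(\lambda + k - j)\,c_{k-j}/(j+1)!$ with $c_0 = 1$ is an extrapolation, assumed here, of the pattern in the first few relations $A = \lambda/2$, $2B = (\lambda+1)A/2 - \lambda/(2\cdot 3)$, and so on. For λ = 1 it should return the coefficients $1/(k+1)!$ of $(e^{\omega} - 1)/\omega$, and for λ = −1 the values $B_k/k!$ of the Euler–Maclaurin formula.

```python
from fractions import Fraction
from math import factorial


def lagrange_coefficients(lam, count):
    """Return [1, A, B, C, ...] with (e^w - 1)^lam = w^lam (1 + A w + B w^2 + ...).

    Uses k c_k = sum_{j=1}^{k} (-1)^(j-1) (lam + k - j) c_{k-j} / (j+1)!,
    an assumed general form of the first few relations for A, B, C, D.
    """
    lam = Fraction(lam)
    c = [Fraction(1)]
    for k in range(1, count + 1):
        s = sum((-1) ** (j - 1) * (lam + k - j) * c[k - j] / factorial(j + 1)
                for j in range(1, k + 1))
        c.append(s / k)
    return c
```

Calling `lagrange_coefficients(-1, 4)` gives 1, −1/2, 1/12, 0, −1/720, which are the familiar Euler–Maclaurin coefficients.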


Now because $\Delta^{-\lambda} = \Sigma^{\lambda}$, Lagrange could replace λ by −λ in (21.21) to get his
extension of the Euler–Maclaurin formula:


$$\Sigma^{\lambda} u = \frac{\int^{\lambda} u\,dx^{\lambda}}{\xi^{\lambda}} + \alpha \frac{\int^{\lambda-1} u\,dx^{\lambda-1}}{\xi^{\lambda-1}} + \beta \frac{\int^{\lambda-2} u\,dx^{\lambda-2}}{\xi^{\lambda-2}} + \cdots. \tag{21.25}$$


We observe that when λ was changed to −λ in (21.24), Lagrange denoted the
changed values of A, B, C, ... by α, β, γ, .... The case λ = 1 in (21.25) is
immediately distinguishable as the Euler–Maclaurin formula. In the same paper,
Lagrange then proceeded to derive a formula for repeated integrals in terms of sums.
He rewrote (21.20) as


$$\xi \frac{du}{dx} = \ln(1 + \Delta)u$$


or in Arbogast’s notation,


$$\xi \frac{d}{dx} = \ln(1 + \Delta) \tag{21.26}$$
and more generally
$$\xi^{\lambda} \frac{d^{\lambda}}{dx^{\lambda}} = [\ln(1 + \Delta)]^{\lambda}. \tag{21.27}$$


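By expanding the right side of (21.27) as a series in powers of Δ, Lagrange obtained
the coefficients of the expansion by a method similar to the one that gave him (21.24).
Once again, he replaced λ by −λ to obtain


$$\frac{\int^{\lambda} u\,dx^{\lambda}}{\xi^{\lambda}} = \Sigma^{\lambda} u + \mu \Sigma^{\lambda-1} u + \nu \Sigma^{\lambda-2} u + \omega \Sigma^{\lambda-3} u + \cdots, \tag{21.28}$$

where

$$\mu = \frac{\lambda}{2}, \quad 2\nu = \frac{(\lambda-1)\mu}{2} - \frac{\lambda}{2\cdot 3}, \quad 3\omega = \frac{(\lambda-2)\nu}{2} - \frac{(\lambda-1)\mu}{2\cdot 3} + \frac{\lambda}{3\cdot 4},$$

$$4\chi = \frac{(\lambda-3)\omega}{2} - \frac{(\lambda-2)\nu}{2\cdot 3} + \frac{(\lambda-1)\mu}{3\cdot 4} - \frac{\lambda}{4\cdot 5},$$

etc.


When λ = 1 in (21.28), Lagrange got the value of the integral in terms of a sum
involving finite differences:


$$\frac{\int u\,dx}{\xi} = \Sigma u + \mu u + \nu \Delta u + \omega \Delta^2 u + \chi \Delta^3 u + \cdots. \tag{21.29}$$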


This is exactly the formula communicated by James Gregory to Collins in November 
1670. Recall that Gregory most probably discovered this formula by integrating 
the Gregory—Newton interpolation formula. Lagrange may not have been aware of 
Gregory’s work, but he referred to Cotes, Stirling, and others who used similar, though 
not identical, results. Laplace may have found this result independently; he used it in 
his astronomical work. Indeed, this formula was sometimes attributed to Laplace, in 
particular by Poisson. 
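
The coefficients in this formula can be generated from Lagrange’s recursion. The Python sketch below is an illustration under assumptions stated here: the function names are hypothetical, and the general step $k\,c_k = \sum_{j=1}^{k} (-1)^{j-1}(\lambda - (k - j))\,c_{k-j}/(j(j+1))$ with $c_0 = 1$ extrapolates the pattern of the first relations $\mu = \lambda/2$, $2\nu = (\lambda-1)\mu/2 - \lambda/(2\cdot 3)$, and so on. For λ = 1 the values 1/2, −1/12, 1/24, −19/720 are Gregory’s coefficients. As a further check, the second function applies Δ to both sides of the λ = 1 formula with u = 1/x and ξ = 1, which gives $\ln(1 + 1/x) = \sum_k c_k \Delta^k(1/x)$ with $\Delta^k(1/x) = (-1)^k k!/(x(x+1)\cdots(x+k))$.

```python
from fractions import Fraction
from math import factorial, log


def integral_coefficients(lam, count):
    """Return [1, mu, nu, omega, chi, ...] for the given lam.

    Uses k c_k = sum_{j=1}^{k} (-1)^(j-1) (lam - (k - j)) c_{k-j} / (j (j+1)),
    an assumed general form of the first few relations.
    """
    lam = Fraction(lam)
    c = [Fraction(1)]
    for k in range(1, count + 1):
        s = sum((-1) ** (j - 1) * (lam - (k - j)) * c[k - j] / (j * (j + 1))
                for j in range(1, k + 1))
        c.append(s / k)
    return c


def log_one_plus_reciprocal(x, terms=12):
    """Approximate ln(1 + 1/x) by sum_{k=0}^{terms} c_k Delta^k (1/x), with lam = 1."""
    c = integral_coefficients(1, terms)
    total = 0.0
    for k in range(terms + 1):
        denom = 1.0
        for i in range(k + 1):
            denom *= x + i
        # Delta^k (1/x) = (-1)^k k! / (x (x+1) ... (x+k))
        total += float(c[k]) * (-1) ** k * factorial(k) / denom
    return total
```

The series converges rather slowly for small x, so the comparison with `log(1 + 1/x)` is best made at a moderately large x such as 5.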

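Lagrange employed (21.29) to derive an inverse factorial series for $\ln\left(1 + \frac{1}{x}\right)$, taking
$u = \frac{1}{x}$ and ξ = 1 to obtain


$$\ln x = \Sigma \frac{1}{x} + \mu \frac{1}{x} + \nu \Delta \frac{1}{x} + \omega \Delta^2 \frac{1}{x} + \chi \Delta^3 \frac{1}{x} + \cdots,$$


where


$$\Delta \frac{1}{x} = \frac{1}{x+1} - \frac{1}{x} = -\frac{1}{x(x + 1)}, \tag{21.30}$$

$$\Delta^2 \frac{1}{x} = \frac{2}{x(x+1)(x+2)}, \tag{21.31}$$

$$\Delta^3 \frac{1}{x} = -\frac{2\cdot 3}{x(x + 1)(x + 2)(x + 3)}, \quad \text{etc.} \tag{21.32}$$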

Changing x to x + 1, Lagrange obtained a similar series for In(x + 1). He then 
subtracted the series for In x to obtain the desired result. 

Lagrange’s heuristic method was immediately welcomed as a powerful tool in 
discovering interesting and useful formulas. Laplace’s papers on finite differences 
in the 1770s discussed and used Lagrange’s symbolic method. Laplace thought that 
Lagrange’s formula $\Delta^n u = (e^{h\frac{d}{dx}} - 1)^n u$ could be rigorously established by the use
of formal power series. He observed that


$$\Delta^n u = \frac{d^n u}{dx^n} h^n + A_1 \frac{d^{n+1} u}{dx^{n+1}} h^{n+1} + A_2 \frac{d^{n+2} u}{dx^{n+2}} h^{n+2} + \cdots \tag{21.33}$$
for constants $A_1, A_2, \ldots$. He believed the problem could be reduced to proving that
these coefficients were identical with the coefficients of the powers of h in the
expansion of $(e^h - 1)^n$. He noted that the constants $A_1, A_2, \ldots$ were the same for
all functions u, so he took $u = e^x$. Then $\Delta e^x = e^{x+h} - e^x = e^x(e^h - 1)$ and, more
generally, $\Delta^n e^x = e^x(e^h - 1)^n$. Thus,
$$(e^h - 1)^n = h^n + A_1 h^{n+1} + A_2 h^{n+2} + \cdots, \tag{21.34}$$


completing his argument. 

In 1807, John Brinkley (1766-1835), professor of mathematics at the University 
of Dublin and a mentor to Hamilton, presented in the Philosophical Transactions an 
interesting expression for the constants $A_1, A_2, \ldots$.²⁸ He noted that


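$$(e^h - 1)^n = e^{nh} - \binom{n}{1} e^{(n-1)h} + \binom{n}{2} e^{(n-2)h} - \cdots$$

$$= \left( 1 - \binom{n}{1} + \binom{n}{2} - \cdots \right) + \left( n - \binom{n}{1}(n-1) + \binom{n}{2}(n-2) - \cdots \right) h$$

$$+ \left( n^2 - \binom{n}{1}(n-1)^2 + \binom{n}{2}(n-2)^2 - \cdots \right) \frac{h^2}{2!} + \cdots. \tag{21.35}$$


Furthermore, it was clear from the formula


$$\Delta^n f(0) = f(n) - \binom{n}{1} f(n-1) + \binom{n}{2} f(n-2) - \cdots \tag{21.36}$$
that the coefficient of $\frac{h^m}{m!}$ in (21.35) was $\Delta^n 0^m$. Thus, Brinkley had
$$A_k = \frac{\Delta^n 0^{n+k}}{(n+k)!}. \tag{21.37}$$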


Note that, by (10.67), these numbers are related to Stirling numbers of the second 
kind: 


$$(n+k)!\,A_k = n!\,S(n+k,n). \tag{21.38}$$
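
Brinkley’s expression is straightforward to evaluate. The following Python sketch is illustrative, with hypothetical function names; it computes $\Delta^n 0^m$ from the alternating binomial sum $\sum_j (-1)^j \binom{n}{j}(n-j)^m$, forms $A_k = \Delta^n 0^{n+k}/(n+k)!$, and allows a comparison with $n!\,S(m,n)$, where the Stirling numbers of the second kind are generated from their usual recurrence.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


def difference_of_zero(n, m):
    """Delta^n 0^m = sum_j (-1)^j C(n, j) (n - j)^m."""
    return sum((-1) ** j * comb(n, j) * (n - j) ** m for j in range(n + 1))


@lru_cache(maxsize=None)
def stirling_second(m, n):
    """S(m, n) from S(m, n) = S(m-1, n-1) + n S(m-1, n)."""
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    return stirling_second(m - 1, n - 1) + n * stirling_second(m - 1, n)


def brinkley_A(n, k):
    """Coefficient A_k of h^(n+k) in (e^h - 1)^n."""
    return Fraction(difference_of_zero(n, n + k), factorial(n + k))
```

For n = 2 this gives $A_1 = 1$ and $A_2 = 7/12$, matching $(e^h - 1)^2 = h^2 + h^3 + \frac{7}{12}h^4 + \cdots$.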


Brinkley (c. 1763-1835) studied at Cambridge, graduating senior wrangler in 1788. 
He was the first Royal Astronomer of Ireland and later became Bishop of Cloyne. It 
was to Brinkley that the 17-year-old Hamilton communicated his work on geometrical 
optics. Brinkley encouraged Hamilton by presenting his work to the Irish Academy 
with the legendary assertion, “This young man, I do not say will be, but is, the 


28 Brinkley (1807). See also Lacroix, Babbage, and Herschel (1816) p. 478. 
first mathematician of his age.”²⁹ Fittingly, Hamilton succeeded Brinkley as Royal
Astronomer. 
Finally, we note some very interesting connections with the results discussed in this 
section. First, when the coefficients $A_k$ in (21.33), given by Brinkley’s formula as
$$A_k = \frac{n!\,S(n+k,n)}{(n+k)!}, \tag{21.39}$$
are substituted in (21.34), the result is the exponential generating function for the
Stirling numbers of the second kind:


$$\frac{(e^h - 1)^n}{n!} = \sum_{m=n}^{\infty} S(m,n) \frac{h^m}{m!}. \tag{21.40}$$


We may also derive the generating function for Stirling numbers of the first kind 
from results given in this section, combined with Stirling’s (10.50) and (10.57). When 
Lagrange’s formulas (21.30) through (21.32), of which Stirling was well aware, are 
substituted in Stirling’s (10.50), we arrive at 


1 lL. tool 1,31 
D =—A-+-A A Pawn, (21.41) 
Zon Dae BP 


where D denotes the derivative with respect to z. Next, (21.41) may be written as
$$D = \log(1 + \Delta),$$


an expression equivalent to (21.26). Thus, we can see that Stirling’s (10.51) and 
(10.52) give the formulas for 


$$\frac{(\log(1 + \Delta))^2}{2!}, \quad \frac{(\log(1 + \Delta))^3}{3!}$$


as series in powers of Δ. We may then generalize these formulas by using (10.57),
with n = λ, and apply


$$\Delta^m \left( \frac{1}{z} \right) = \frac{(-1)^m\, m!}{z(z+1)\cdots(z+m)} \tag{21.42}$$
to get 
m{1 
1 (es deEAD 74 nd (+) 
(—1)*D* (=) =(a)s 7 (<) =D) ema)" ——. 


m=h 


29 This remark appears in many places, including Robert Perceval Graves’s article about Hamilton in the 
Dublin University Magazine of 1842, vol. 19, pp. 94-110. 
Furthermore, since by (10.60) $s(m,\lambda) = (-1)^{m-\lambda} c(m,\lambda)$, (21.43) implies that


$$\frac{(\log(1 + \Delta))^{\lambda}}{\lambda!} = \sum_{m=\lambda}^{\infty} s(m,\lambda) \frac{\Delta^m}{m!}, \tag{21.44}$$


giving us the exponential generating function for the Stirling numbers of the first kind. 

Lagrange’s calculations produce another relation: between Bernoulli numbers and 
Stirling numbers of the second kind. To see this, note that the left-hand side of (21.22) 
may be expanded in terms of Bernoulli numbers: 


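$$\frac{e^{\omega}}{e^{\omega} - 1} - \frac{1}{\omega} = \frac{1}{\omega} \left( \frac{\omega e^{\omega}}{e^{\omega} - 1} - 1 \right) = \frac{1}{\omega} \left( \sum_{n=0}^{\infty} B_n(1) \frac{\omega^n}{n!} - 1 \right)$$

$$= \sum_{n=1}^{\infty} B_n(1) \frac{\omega^{n-1}}{n!} = \frac{1}{2} + \sum_{n=1}^{\infty} B_{n+1} \frac{\omega^n}{(n+1)!}.$$


Now $B_1(1) = \frac{1}{2}$ and $B_n(1) = B_n$ when n ≥ 2; therefore, (21.23) can be rewritten as


$$\lambda \left( \frac{1}{2} + \sum_{n=2}^{\infty} B_n \frac{\omega^{n-1}}{n!} \right) \left( 1 + \sum_{n=1}^{\infty} A_n \omega^n \right) = \sum_{n=1}^{\infty} n A_n \omega^{n-1},$$


where $A_k$ is given by (21.39). Equating the coefficients of $\omega^{n-1}$ on both sides yields
the required result:


$$\frac{n\,S(\lambda+n,\lambda)}{\lambda(\lambda + 1)\cdots(\lambda + n)} = \sum_{k=1}^{n} \frac{B_k(1)\,A_{n-k}}{k!},$$
where
$$A_k = \frac{S(\lambda+k,\lambda)}{(\lambda+1)(\lambda+2)\cdots(\lambda+k)}. \tag{21.45}$$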


21.4 Français’s Method of Solving Differential Equations


Jacques F. Français, whose mathematical work incorporated some results from the
notebooks of his late brother François, based his solution of ordinary differential
equations with constant coefficients³⁰ on the relation between the Arbogast operator
E and the differential operator D:


$$E\phi(x) = \phi(x + 1) = e^{D}\phi(x), \tag{21.46}$$


where $D = \frac{d}{dx}$. We note that Français’s notation had δ for D. Now if φ were a solution
of the equation


$$\frac{d\phi}{dx} - a\phi = 0, \quad \text{or} \quad (D - a)\phi = 0, \tag{21.47}$$


then Français had D − a = 0 by the separation of the operator. By (21.46) he had
$E = e^{D} = e^{a}$ or $E^{k} = e^{ak}$ and hence $1 = e^{ak}E^{-k}$. He then used this relation to solve
(21.47):


$$\phi(x) = 1\,\phi(x) = e^{ak}E^{-k}\phi(x) = e^{ak}\phi(x - k)$$


or


$$\phi(k) = \phi(0)e^{ak}.$$


Thus, $\phi(x) = Ce^{ax}$, where C was a constant, and the differential equation (21.47)
was solved. To solve the general homogeneous differential equation with constant coefficients


$$D^n\phi + a_1 D^{n-1}\phi + \cdots + a_n\phi = 0,$$


Français separated the operator and factored the nth degree polynomial in D to
obtain


$$(D - \alpha_1)(D - \alpha_2) \cdots (D - \alpha_n) = 0.$$
This gave him the n equations:


$$D - \alpha_1 = 0, \quad D - \alpha_2 = 0, \ \ldots, \quad D - \alpha_n = 0,$$
whose solutions he expressed as $e^{\alpha_1 x}, e^{\alpha_2 x}, \ldots, e^{\alpha_n x}$. Note that this is an exhaustive
list of the independent solutions, under the condition that $\alpha_1, \alpha_2, \ldots, \alpha_n$ are all
distinct. Français also applied the operational method for the summation of series.
He found a series for π, reminiscent of a result obtained by the Kerala school. In a
paper of 1811, he started with Euler’s series


$$\frac{\pi}{4}\alpha = \sin\alpha - \frac{1}{3^2}\sin 3\alpha + \frac{1}{5^2}\sin 5\alpha - \frac{1}{7^2}\sin 7\alpha + \cdots.$$


30 Français (1812-1813).
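He rewrote this as
$$\frac{\pi}{2}\alpha\sqrt{-1} = \left( e^{\alpha\sqrt{-1}} - e^{-\alpha\sqrt{-1}} \right) - \frac{1}{3^2}\left( e^{3\alpha\sqrt{-1}} - e^{-3\alpha\sqrt{-1}} \right) + \frac{1}{5^2}\left( e^{5\alpha\sqrt{-1}} - e^{-5\alpha\sqrt{-1}} \right) - \cdots. \tag{21.48}$$


He then set $\alpha\sqrt{-1} = D$, $e^{D} = E$ to obtain


$$\frac{\pi}{2}D = \left( E - E^{-1} \right) - \frac{1}{3^2}\left( E^{3} - E^{-3} \right) + \frac{1}{5^2}\left( E^{5} - E^{-5} \right) - \cdots.$$


Français next applied this operator equation to φ(x), so that


$$\frac{\pi}{2}\phi'(x) = \phi(x + 1) - \phi(x - 1) - \frac{1}{3^2}\left( \phi(x + 3) - \phi(x - 3) \right) + \cdots.$$


Recall that $E\phi(x) = \phi(x + 1)$. Taking φ(x) = x, he obtained Leibniz’s formula. For
$\phi(x) = \frac{1}{x}$, he found


$$\frac{\pi}{4}\cdot\frac{1}{x^2} = \frac{1}{x^2 - 1} - \frac{1}{3}\left( \frac{1}{x^2 - 3^2} \right) + \frac{1}{5}\left( \frac{1}{x^2 - 5^2} \right) - \frac{1}{7}\left( \frac{1}{x^2 - 7^2} \right) + \cdots.$$


Putting $\frac{1}{x^2} = -a$, he could rewrite this as


$$\frac{\pi}{4} = \frac{1}{1 + a} - \frac{1}{3}\cdot\frac{1}{1 + 3^2 a} + \frac{1}{5}\cdot\frac{1}{1 + 5^2 a} - \frac{1}{7}\cdot\frac{1}{1 + 7^2 a} + \cdots.$$


Then again, by taking φ(x) = ln x, Français obtained the formula
$$\frac{\pi}{2}\cdot\frac{1}{x} = \ln\frac{x + 1}{x - 1} - \frac{1}{3^2}\ln\frac{x + 3}{x - 3} + \frac{1}{5^2}\ln\frac{x + 5}{x - 5} - \cdots.$$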


Finally, to derive another interesting series, he set a/—1 = + in (21.49) and then 
integrated, obtaining 


oA 1 1 
GT ea Oe eS 


21.5 Herschel: Calculus of Finite Differences 


In the appendix to their English translation of Lacroix’s book,³¹ Babbage and Herschel
included a large number of examples on functional and difference equations, some of 
which were original. Like his followers, Herschel showed much manipulative ability 


31 Lacroix, Babbage, and Herschel (1816). 
of an algebraic kind. To see some of Herschel’s work on difference equations, consider
the equation


$$u_{x+1}u_x - a(u_{x+1} - u_x) + 1 = 0,$$


first given in a paper of Laplace but for which Herschel found a new solution. He
differentiated and rewrote it as

$$(a + u_{x+1})\,du_x - (a - u_x)\,du_{x+1} = 0.$$


He solved for a in the original equation and used this value in the second equation
to find, after simplification


$$\frac{du_{x+1}}{1 + u_{x+1}^2} - \frac{du_x}{1 + u_x^2} = 0, \quad \text{or} \quad \Delta \int \frac{du_x}{1 + u_x^2} = A,$$


where A was a constant depending on a. Solving this simple difference equation, he
had
$$\int \frac{du_x}{1 + u_x^2} = Ax + C,$$


where C was an arbitrary constant. After computing the integral, he got $u_x = \tan(Ax + C)$ and therefore


$$u_{x+1} = \tan(Ax + C + A) = \frac{u_x + \tan A}{1 - u_x \tan A}.$$


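At this point he rewrote the original difference equation as

$$u_{x+1} = \frac{u_x + \frac{1}{a}}{1 - u_x \cdot \frac{1}{a}},$$

so he could conclude that $\tan A = \frac{1}{a}$ or $A = \tan^{-1}\frac{1}{a}$. Thus, Herschel obtained the
result


$$u_x = \tan\left( x \tan^{-1}\frac{1}{a} + C \right).$$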
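
Herschel’s solution can be checked by direct substitution. The short Python sketch below is an illustration with hypothetical function names; it evaluates $u_x = \tan(x \tan^{-1}(1/a) + C)$ and the left-hand side of $u_{x+1}u_x - a(u_{x+1} - u_x) + 1 = 0$ for sample values of a and C chosen here, which should vanish up to rounding error.

```python
from math import atan, tan


def herschel_solution(x, a, c):
    """u_x = tan(x * arctan(1/a) + c)."""
    return tan(x * atan(1.0 / a) + c)


def residual(x, a, c):
    """Left-hand side of u_{x+1} u_x - a (u_{x+1} - u_x) + 1 = 0."""
    u0 = herschel_solution(x, a, c)
    u1 = herschel_solution(x + 1, a, c)
    return u1 * u0 - a * (u1 - u0) + 1.0
```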


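To see an example of Herschel’s symbolic approach, take an analytic function f(x),
and let


$$f(e^t) = A_0 + A_1 t + A_2 t^2 + \cdots + A_x t^x + \cdots.$$


Herschel wished to find an expression for $A_x$; he started with the Taylor expansion


$$f(e^t) = f(1) + \frac{f'(1)}{1}(e^t - 1) + \frac{f''(1)}{1\cdot 2}(e^t - 1)^2 + \cdots$$

and noted that for x ≥ 1, the coefficient of $t^x$ in f(1) was 0; in
$$\frac{f'(1)}{1}(e^t - 1),$$
it was
$$\frac{f'(1)}{1}\cdot\frac{1}{1\cdot 2\cdots x} = \frac{f'(1)}{1}\cdot\frac{\Delta 0^x}{1\cdot 2\cdots x},$$
and in
$$\frac{f''(1)}{1\cdot 2}(e^t - 1)^2$$
it was


$$\frac{f''(1)}{1\cdot 2}\cdot\frac{\Delta^2 0^x}{1\cdot 2\cdots x},$$


and so on. He could conclude that


$$A_x = \frac{1}{1\cdot 2\cdots x}\left( f(1)\,0^x + \frac{f'(1)}{1}\Delta 0^x + \frac{f''(1)}{1\cdot 2}\Delta^2 0^x + \cdots \right).$$
He then wrote, “Let the symbols of operation be separated from those of quantity,
and we get


$$A_x = \frac{1}{1\cdot 2\cdots x}\left\{ f(1) + \frac{f'(1)}{1}\Delta + \frac{f''(1)}{1\cdot 2}\Delta^2 + \cdots \right\} 0^x = \frac{f(1 + \Delta)\,0^x}{1\cdot 2\cdots x}.$$
”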


Herschel apparently saw that taking a function of an operator was somewhat
problematic, commenting that it should be understood “to have no other meaning than
its development, of which it is a mere abbreviated expression.”
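
Read as an abbreviation for its development, Herschel’s formula can be tested on polynomials, for which the development terminates. The Python sketch below is illustrative; the function names and the coefficient list `poly` are assumptions introduced here. For $f(z) = \sum_d p_d z^d$ one has $f(e^t) = \sum_d p_d e^{dt}$, so the coefficient of $t^x$ is $\sum_d p_d d^x / x!$, and this can be compared with $\frac{1}{x!}\sum_k \frac{f^{(k)}(1)}{k!}\Delta^k 0^x$.

```python
from fractions import Fraction
from math import comb, factorial


def difference_of_zero(n, m):
    """Delta^n 0^m = sum_j (-1)^j C(n, j) (n - j)^m."""
    return sum((-1) ** j * comb(n, j) * (n - j) ** m for j in range(n + 1))


def herschel_coefficient(poly, x):
    """A_x from the development of f(1 + Delta) 0^x / x!, for x >= 1.

    poly[d] is the coefficient of z^d in the polynomial f(z).
    """
    degree = len(poly) - 1
    total = Fraction(0)
    for k in range(degree + 1):
        # f^(k)(1) / k! = sum_d poly[d] * C(d, k)
        taylor_k = sum(Fraction(poly[d]) * comb(d, k) for d in range(k, degree + 1))
        total += taylor_k * difference_of_zero(k, x)
    return total / factorial(x)


def direct_coefficient(poly, x):
    """Coefficient of t^x in f(e^t) = sum_d poly[d] e^(d t), for x >= 1."""
    return sum(Fraction(poly[d]) * d ** x for d in range(len(poly))) / factorial(x)
```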


21.6 Murphy’s Theory of Analytical Operations 


Murphy began his 1837 paper³² on analytical operations with the statement, “The
elements of which every distinct analytical process is composed are three, namely, 
first the Subject, that is, the symbol on which a certain notified operation is to be 
performed; secondly, the Operation itself, represented by its own symbol; thirdly, the 
Result, which may be connected with the former two by the algebraic sign of equality.” 
He defined several operations. For example, he denoted by ψ the operation changing
x to x + h, and by Δ the operation subtracting the subject from the result of changing x
to x + h in the subject. He wrote these operations as


$$[f(x)]\psi = f(x + h), \qquad [f(x)]\Delta = f(x + h) - f(x).$$


32 Murphy (1837). 
The operations themselves could be algebraically combined. Thus, ψ = Δ + 1,
where 1 was the operation under which the subject remained the same. Murphy defined
the linearity of an operation:


$$[f(x) + \phi(x)]\psi = [f(x)]\psi + [\phi(x)]\psi.$$


He called two operations fixed or free, depending on whether they were noncommutative
or commutative in the given situation. Thus,


$$[x^n]x\psi = [x^{n+1}]\psi = (x + h)^{n+1},$$
$$[x^n]\psi x = [(x + h)^n]x = x(x + h)^n,$$


so that xψ ≠ ψx; but for a constant a


$$[x^n]a\psi = [ax^n]\psi = a(x + h)^n,$$
$$[x^n]\psi a = [(x + h)^n]a = a(x + h)^n,$$


so that aψ = ψa.
He also stated a noncommutative binomial theorem: When θ and θ′ were fixed
operations,


$$(\theta + \theta')^n = \theta^{(n)} + \theta^{(n-1)}\theta' + \theta^{(n-2)}\theta'^{(2)} + \cdots + \theta\,\theta'^{(n-1)} + \theta'^{(n)}.$$


Here the term $\theta^{(n-1)}\theta'$ represented the sum of n terms formed by placing θ′ at the
beginning, at the end, and in all the n − 2 intermediate positions of the expression
$\theta\cdot\theta\cdots\theta = \theta^{n-1}$. Similarly, $\theta^{(n-2)}\theta'^{(2)}$ signified a similar sum of $\frac{n(n-1)}{2}$ terms and
so on.

Murphy carefully defined some important algebraic concepts, such as the inverse
and the kernel. Concerning the inverse: “Suppose θ to represent any operation which
performed on a subject [u] gives y as the result, then the inverse operation is denoted
by $\theta^{-1}$, and is such that when [y] is made the subject u becomes the result.” The kernel
was called the appendage, denoted by $[0]\theta^{-1}$. Murphy showed, for example, that if $d_x$
denoted the derivative with respect to x, then $[0]d_x^{-1}$ consisted of all the constants. To
prove this, he took $[0]d_x^{-1} = \phi(x)$, implying that $[\phi(x)]d_x = 0$, and hence


$$[\phi(x)]d_x^2 = 0, \quad [\phi(x)]d_x^3 = 0, \ \ldots.$$
Murphy then employed Taylor’s theorem,


$$\phi(x + h) = \phi(x) + h\frac{d\phi}{dx} + \frac{h^2}{1\cdot 2}\frac{d^2\phi}{dx^2} + \cdots,$$


to obtain φ(x + h) = φ(x), meaning φ(x) was a constant. Here Murphy assumed
without comment that φ was analytic. To find the kernel of $d_x^{-n}$, he observed that


$$[0]d_x^{-2} = [0]d_x^{-1}d_x^{-1} = [c]d_x^{-1} + [0]d_x^{-1} = cx + c',$$
and, more generally,
$$[0]d_x^{-n} = A_1x^{n-1} + A_2x^{n-2} + \cdots + A_n.$$


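In another interesting example, Murphy let the subject be $f(x + y)$; let $\psi_x$ be the
operation under which $x$ received an increment $h$; and let $\psi_y$ be the operation under
which $y$ received an increment of $h$. Then, obviously, $[f(x + y)](\psi_x - \psi_y) = 0$, and
therefore $[f(x + y)](\Delta_y - \Delta_x) = 0$, so that $f(x + y)$ was a value in $[0](\Delta_y - \Delta_x)^{-1}$.
He explained that $(\Delta_y - \Delta_x)^{-1}$ could be expanded as

\[ (\Delta_y - \Delta_x)^{-1} = \Delta_y^{-1} + \Delta_y^{-2}\Delta_x + \Delta_y^{-3}\Delta_x^2 + \Delta_y^{-4}\Delta_x^3 + \cdots, \]

so that

\[ [0](\Delta_y - \Delta_x)^{-1} = [0](\Delta_y^{-1} + \Delta_y^{-2}\Delta_x + \Delta_y^{-3}\Delta_x^2 + \cdots). \tag{21.50} \]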


Murphy then explained how to derive the Gregory–Newton interpolation formula from
(21.50). He noted that $[0]\Delta_y^{-1}$ was a function independent of $y$, so he had $[0]\Delta_y^{-1} = \phi(x)$,
where $\phi$ was an arbitrary function of $x$. Similarly, $[0]\Delta_y^{-2} = \phi(x)\cdot\frac{y}{h}$, where
the appendage was omitted without loss of generality. Then again,

\[ [0]\Delta_y^{-3} = \phi(x)\cdot\frac{y(y-h)}{1\cdot 2\cdot h^2}, \]

for

\[ [y(y-h)]\Delta_y = (y+h)y - y(y-h) = 2hy. \]

By a similar argument,

\[ [0]\Delta_y^{-4} = \phi(x)\cdot\frac{y(y-h)(y-2h)}{1\cdot 2\cdot 3\cdot h^3}, \]

and so on. In this way, Murphy obtained the relation

\[ [0](\Delta_y - \Delta_x)^{-1} = \phi(x) + \frac{y}{h}\Delta\phi(x) + \frac{y(y-h)}{1\cdot 2\cdot h^2}\Delta^2\phi(x) + \frac{y(y-h)(y-2h)}{1\cdot 2\cdot 3\cdot h^3}\Delta^3\phi(x) + \cdots, \]

and “since f(x + h) is included in this general expression, the particular form to be
assigned to the arbitrary $\phi(x)$ is known by making $y = 0$, which gives $\phi(x) = f(x)$.”
Thus, Murphy had the Gregory–Newton interpolation formula,

\[ f(x+y) = f(x) + \frac{y}{h}\Delta f(x) + \frac{y(y-h)}{1\cdot 2\cdot h^2}\Delta^2 f(x) + \frac{y(y-h)(y-2h)}{1\cdot 2\cdot 3\cdot h^3}\Delta^3 f(x) + \cdots. \]


Note that this is equivalent to (9.7). From this he derived the binomial theorem as Cotes
had done, and perhaps James Gregory before him. Murphy took $f(x) = (1+b)^x$,
$h = 1$ and observed that $\Delta^n f(x) = (1+b)^x b^n$ so that the binomial theorem followed
after dividing both sides of the equation by $(1 + b)^x$:

\[ (1+b)^y = 1 + yb + \frac{y(y-1)}{1\cdot 2}b^2 + \frac{y(y-1)(y-2)}{1\cdot 2\cdot 3}b^3 + \cdots. \]
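
Murphy's passage from the Gregory–Newton formula to the binomial theorem can be checked by direct computation. The following is a minimal sketch, not Murphy's own procedure: the function names `forward_differences` and `gregory_newton` are illustrative, the step is fixed at $h = 1$, and exact rational arithmetic is assumed so that the identities hold without rounding.

```python
from fractions import Fraction

def forward_differences(values):
    """Return [f(0), Δf(0), Δ²f(0), ...] from samples f(0), f(1), ... (step h = 1)."""
    diffs = []
    row = list(values)
    while row:
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return diffs

def gregory_newton(values, y):
    """Evaluate the Gregory-Newton sum at y from samples at 0, 1, 2, ... (h = 1)."""
    total = Fraction(0)
    coeff = Fraction(1)  # holds y(y-1)...(y-k+1)/k!
    for k, d in enumerate(forward_differences(values)):
        total += coeff * d
        coeff = coeff * (y - k) / (k + 1)
    return total

b = Fraction(1, 2)
samples = [(1 + b) ** x for x in range(6)]  # f(x) = (1+b)^x at x = 0..5
diffs = forward_differences(samples)

# Murphy's observation at x = 0: Δ^n f(0) = b^n
assert all(d == b ** n for n, d in enumerate(diffs))

# For y = 3 the sum terminates and reproduces (1+b)^3
print(gregory_newton(samples, 3))  # 27/8
```

For a non-integer $y$ the same sum becomes an infinite series, which is the case of interest for the binomial theorem; the sketch only covers the terminating case.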


21.7 Duncan Gregory’s Operational Calculus 


Duncan Gregory published many papers on operational calculus, illustrating the power
of the method by elegant derivations of known results.³³ Gregory’s proof of Leibniz’s
formula for the nth derivative of a product of two functions began with the observation
that Euler’s proof of the binomial theorem

\[ (a+b)^n = a^n + na^{n-1}b + \frac{n(n-1)}{1\cdot 2}a^{n-2}b^2 + \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}a^{n-3}b^3 + \cdots \]


required that n was a fraction. More importantly, Gregory wrote that a, b should satisfy 


(1) The commutative law, $ab = ba$,
(2) The distributive law, $c(a+b) = ca + cb$,
(3) The index law, $a^m \cdot a^n = a^{m+n}$.


Gregory added, “Now, since it can be shown that the operations both in the Differential 
Calculus and the Calculus of Finite Differences are subject to these laws, the Binomial 
Theorem may be at once assumed as true with respect to them, so that it is not 
necessary to repeat the demonstration of it for each case.” To prove Leibniz’s theorem, 
Gregory observed that 


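\[ \frac{d(uv)}{dx} = u\frac{dv}{dx} + v\frac{du}{dx}. \]

He then rewrote this equation, as had Arbogast:

\[ \frac{d(uv)}{dx} = \left(\frac{d}{dx} + \frac{d'}{dx}\right)uv, \]

where $\frac{d}{dx}$ acted on $v$ but not on $u$ and $\frac{d'}{dx}$ acted on $u$ but not on $v$. Since these operations
were independent of each other, they commuted, so that

\[ \begin{aligned}
\frac{d^n(uv)}{dx^n} &= \left(\frac{d}{dx} + \frac{d'}{dx}\right)^n uv \\
&= \left\{\left(\frac{d}{dx}\right)^n + n\left(\frac{d}{dx}\right)^{n-1}\frac{d'}{dx} + \frac{n(n-1)}{1\cdot 2}\left(\frac{d}{dx}\right)^{n-2}\left(\frac{d'}{dx}\right)^2 + \cdots\right\}uv \\
&= u\frac{d^nv}{dx^n} + n\frac{du}{dx}\frac{d^{n-1}v}{dx^{n-1}} + \frac{n(n-1)}{1\cdot 2}\frac{d^2u}{dx^2}\frac{d^{n-2}v}{dx^{n-2}} + \cdots.
\end{aligned} \]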


33 Gregory (1865) pp. 14-27, 108-123.

Gregory remarked that this result was true with $n$ negative or fractional, or “in the
cases of integration and general differentiation.” He then took $v = 1$ and $n = -1$ to
obtain Bernoulli’s formula,

\[ \int u\,dx = xu - \frac{x^2}{1\cdot 2}\frac{du}{dx} + \frac{x^3}{1\cdot 2\cdot 3}\frac{d^2u}{dx^2} - \cdots. \]


Using Arbogast’s E operator, Gregory derived a proof of the Newton—Montmort 
transformation, given by Euler in his 1755 differential calculus book. Suppose 


\[ S = ax + a_1x^2 + a_2x^3 + a_3x^4 + \cdots \]

and $a_1 = Ea$, $a_2 = E^2a$, $a_3 = E^3a, \ldots$. We write $E$ instead of Gregory’s $D$, since
$D$ might be confused with the derivative. Recall that $E = 1 + \Delta$, where $\Delta$ is the
difference operator; thus, Gregory derived the Newton–Montmort transformation:

\[ \begin{aligned}
S &= (x + x^2E + x^3E^2 + \cdots)a \\
&= x(1 - xE)^{-1}a = x(1 - x - x\Delta)^{-1}a \\
&= \frac{x}{1-x}\left(1 - \frac{x}{1-x}\Delta\right)^{-1}a \\
&= \frac{x}{1-x}a + \frac{x^2}{(1-x)^2}\Delta a + \frac{x^3}{(1-x)^3}\Delta^2a + \cdots.
\end{aligned} \]


Recall that we have discussed this formula earlier, as (10.3). 
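
The Newton–Montmort transformation is easy to test numerically. The sketch below is only an illustration under stated assumptions: the sequence $a_n = 1/(n+1)$ and the point $x = -1$ are chosen here (they are not in Gregory's paper), so that $S = -\ln 2$; the helper names are hypothetical.

```python
import math
from fractions import Fraction

def leading_differences(a):
    """Return a_0, Δa_0, Δ²a_0, ... for the finite list a."""
    out, row = [], list(a)
    while row:
        out.append(row[0])
        row = [q - p for p, q in zip(row, row[1:])]
    return out

def direct_sum(a, x):
    """Partial sum of S = a_0 x + a_1 x^2 + a_2 x^3 + ..."""
    return sum(float(c) * x ** (n + 1) for n, c in enumerate(a))

def transformed_sum(a, x):
    """Partial sum of sum_k (x/(1-x))^(k+1) Δ^k a_0, the transformed series."""
    t = x / (1 - x)
    return sum(float(d) * t ** (k + 1)
               for k, d in enumerate(leading_differences(a)))

a = [Fraction(1, n + 1) for n in range(40)]  # assumed test sequence
x = -1.0
print(direct_sum(a, x), transformed_sum(a, x), -math.log(2))
```

With the same 40 coefficients, the transformed sum agrees with $-\ln 2$ far more closely than the direct partial sum, which illustrates why the transformation was valued as a convergence accelerator.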
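Gregory also found an operational method for solving linear ordinary differential
equations with constant coefficients. He began with the equation

\[ \frac{d^ny}{dx^n} + A\frac{d^{n-1}y}{dx^{n-1}} + B\frac{d^{n-2}y}{dx^{n-2}} + \cdots + R\frac{dy}{dx} + Sy = X, \]

where $X$ was a function of $x$. After separating “the signs of operation from those of
quantity,” the equation became

\[ \left(\frac{d^n}{dx^n} + A\frac{d^{n-1}}{dx^{n-1}} + B\frac{d^{n-2}}{dx^{n-2}} + \cdots + R\frac{d}{dx} + S\right)y = X. \]

Note that this can also be written as

\[ f\left(\frac{d}{dx}\right)y = X, \]

where $f$ is a polynomial. Gregory’s problem was to find

\[ y = \left\{f\left(\frac{d}{dx}\right)\right\}^{-1}X, \]

and he first worked out the simplest case, where $f(x) = 1 + x$ and $X = 0$. Gregory
calculated

\[ \begin{aligned}
y &= \left(1 + \frac{d}{dx}\right)^{-1}0 \\
&= \left(\frac{d}{dx}\right)^{-1}\left(1 + \frac{d^{-1}}{dx^{-1}}\right)^{-1}0 \\
&= \left(1 - \frac{d^{-1}}{dx^{-1}} + \frac{d^{-2}}{dx^{-2}} - \cdots\right)\frac{d^{-1}}{dx^{-1}}0 \\
&= C\left(1 - x + \frac{x^2}{1\cdot 2} - \frac{x^3}{1\cdot 2\cdot 3} + \cdots\right) = Ce^{-x}.
\end{aligned} \]

He noted that $\frac{d^{-1}}{dx^{-1}} = \int dx$. Now note that if $f(x) = a + x$, one would get $y = ce^{-ax}$.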


Gregory then observed that

\[ \left(\frac{d}{dx} \pm a\right)^nX = e^{\mp ax}\left(\frac{d}{dx}\right)^ne^{\pm ax}X, \]


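provable by means of the binomial theorem. Gregory finally considered the general
case:

\[ \left(\frac{d}{dx} - a_1\right)\left(\frac{d}{dx} - a_2\right)\left(\frac{d}{dx} - a_3\right)\cdots\left(\frac{d}{dx} - a_n\right)y = X. \]

He applied $\left(\frac{d}{dx} - a_1\right)^{-1}$ to both sides of the equation to find

\[ \left(\frac{d}{dx} - a_2\right)\left(\frac{d}{dx} - a_3\right)\cdots\left(\frac{d}{dx} - a_n\right)y = \left(\frac{d}{dx} - a_1\right)^{-1}X = e^{a_1x}\int e^{-a_1x}X\,dx. \]

Similarly,

\[ \begin{aligned}
\left(\frac{d}{dx} - a_3\right)\cdots\left(\frac{d}{dx} - a_n\right)y
&= \left(\frac{d}{dx} - a_2\right)^{-1}\left(\frac{d}{dx} - a_1\right)^{-1}X \\
&= e^{a_2x}\int e^{(a_1 - a_2)x}\left(\int e^{-a_1x}X\,dx\right)dx \\
&= \frac{e^{a_1x}\int e^{-a_1x}X\,dx}{a_1 - a_2} + \frac{e^{a_2x}\int e^{-a_2x}X\,dx}{a_2 - a_1},
\end{aligned} \]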


using integration by parts in the last step. Thus, Gregory’s final formula took 
the form 


\[ y = \frac{e^{a_1x}\int e^{-a_1x}X\,dx}{(a_1 - a_2)(a_1 - a_3)\cdots(a_1 - a_n)} + \cdots + \frac{e^{a_nx}\int e^{-a_nx}X\,dx}{(a_n - a_1)(a_n - a_2)\cdots(a_n - a_{n-1})}. \]

In 1811, Français had used the same method, going a step beyond Gregory by using
partial fractions to decompose $\left\{f\left(\frac{d}{dx}\right)\right\}^{-1}$. By this technique,

\[ \left\{f\left(\frac{d}{dx}\right)\right\}^{-1} = \sum_{i=1}^{n}\frac{N_i}{\frac{d}{dx} - a_i}, \]

where Français assumed that the roots $a_1, a_2, \ldots, a_n$ of $f(x) = 0$ were distinct. When
the value of $N_i$ was substituted, the result was the same as Gregory’s. Français also
showed that his method could be extended to the case of repeated roots.


21.8 Boole’s Operational Calculus 


In his 1844 paper “On a General Method in Analysis,”³⁴ Boole extended Murphy and
Gregory’s symbolic method to treat problems on linear ordinary and partial differential
equations with variable coefficients, linear difference equations, summation of series,
and the computation of multiple integrals. He started his paper by stating several
general propositions on functions of commutative and noncommutative operators. He
made frequent use of some special cases and noted that they were already known: Let
$x = e^\theta$. Then $x\frac{d}{dx} = \frac{d}{d\theta} = D$, so

\[ f(D)e^{m\theta}u = e^{m\theta}f(D + m)u, \tag{21.51} \]

\[ f(D)e^{m\theta} = f(m)e^{m\theta}, \tag{21.52} \]

\[ D(D-1)\cdots(D-n+1)u = x^n\left(\frac{d}{dx}\right)^nu. \tag{21.53} \]

Though Boole did not explicitly say so, $f(x)$ is a function expandable as a series.
Relation (21.51) can be verified from the particular case $f(D) = D^n$. In this case, by
Leibniz’s formula for the nth derivative of a product, we can see that

\[ \frac{d^n}{d\theta^n}(e^{m\theta}u) = e^{m\theta}\left(\frac{d}{d\theta} + m\right)^nu. \]


The other two cases can be easily verified. 
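
Relations (21.51) and (21.53) can be checked on polynomials in $x$, where $D = x\frac{d}{dx}$ simply multiplies the coefficient of $x^k$ by $k$. The following is a small sketch under that assumption; the coefficient-list representation and the names `theta`, `lhs`, and `rhs` are illustrative choices, not Boole's notation.

```python
def deriv(p):
    """d/dx of a polynomial given as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:]

def theta(p):
    """D = x d/dx: multiplies the coefficient of x^k by k."""
    return [k * c for k, c in enumerate(p)]

def lhs(p, n):
    """Apply D(D-1)...(D-n+1) to p, one factor (D - j) at a time."""
    for j in range(n):
        d = theta(p)
        p = [a - j * b for a, b in zip(d, p)]
    return p

def rhs(p, n):
    """Apply x^n (d/dx)^n to p (assumes len(p) >= n)."""
    q = list(p)
    for _ in range(n):
        q = deriv(q)
    return [0] * n + q

u = [5, -3, 0, 2, 7]            # 5 - 3x + 2x^3 + 7x^4
assert lhs(u, 2) == rhs(u, 2)   # relation (21.53) with n = 2

# Relation (21.51) with f(D) = D and e^{m θ} = x^m: D(x^m u) = x^m (D + m) u
m = 3
left = theta([0] * m + u)
right = [0] * m + [d + m * c for d, c in zip(theta(u), u)]
assert left == right
```

Since both relations are linear in $u$, agreement on every monomial $x^k$ is what the check really tests; extending it to general series is the formal step Boole takes for granted.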
Boole’s fundamental theorem of development was given by the formula 


\[ f_0(D)u + f_1(D)e^{\theta}u + f_2(D)e^{2\theta}u + \cdots = \sum\left\{\left(f_0(m)u_m + f_1(m)u_{m-1} + f_2(m)u_{m-2} + \cdots\right)e^{m\theta}\right\}, \tag{21.54} \]

34 Boole (1844b).

where $u = \sum u_me^{m\theta}$. He verified (21.54) by substituting the series for $u$ on the
left-hand side of the equation and applying (21.51). Boole applied his development 
theorem to the summation of series, noting that if the coefficients of u satisfied a 
linear recurrence relation 


\[ f_0(m)u_m + f_1(m)u_{m-1} + \cdots = 0, \]

then (21.54) yielded a differential equation satisfied by $u$. In cases where this
differential equation could be solved in closed form, he had the sum of the series
$\sum u_mx^m$. This method allowed him to use the recurrence relation satisfied by the
coefficients of the series in order to quickly find the differential equation satisfied
by the series.

As an example of a series summation, Boole considered for any real $n$,

\[ u = 1 - \frac{n^2}{1\cdot 2}x^2 + \frac{n^2(n^2 - 2^2)}{1\cdot 2\cdot 3\cdot 4}x^4 - \frac{n^2(n^2 - 2^2)(n^2 - 4^2)}{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6}x^6 + \cdots. \tag{21.55} \]


In this case,

\[ u_m = \frac{(m-2)^2 - n^2}{m(m-1)}u_{m-2}. \]

He could then immediately write the differential equation satisfied by the series as

\[ u - \frac{(D-2)^2 - n^2}{D(D-1)}e^{2\theta}u = 1, \]

or

\[ D(D-1)u - \left((D-2)^2 - n^2\right)e^{2\theta}u = 0. \tag{21.56} \]

Applying (21.51) and (21.53),

\[ (D-2)^2e^{2\theta}u = e^{2\theta}D^2u = e^{2\theta}(D(D-1) + D)u. \]

Thus, (21.56) was simplified to

\[ (1 - x^2)\frac{d^2u}{dx^2} - x\frac{du}{dx} + n^2u = 0. \tag{21.57} \]


Boole then substituted $\sqrt{1 - x^2}\,\frac{d}{dx} = \frac{d}{dy}$, or $y = \sin^{-1}x$, to convert the differential
equation (21.57) to $\frac{d^2u}{dy^2} + n^2u = 0$. This gave Boole the solution

\[ u = c_1\cos(ny) + c_2\sin(ny) = c_1\cos(n\sin^{-1}x) + c_2\sin(n\sin^{-1}x) \]

with the constants $c_1$ and $c_2$ equal to 1 and 0, respectively. He therefore had

\[ \cos(n\sin^{-1}x) = 1 - \frac{n^2}{1\cdot 2}x^2 + \frac{n^2(n^2 - 2^2)}{1\cdot 2\cdot 3\cdot 4}x^4 - \cdots \]

or

\[ \cos(ny) = 1 - \frac{n^2}{2!}\sin^2y + \frac{n^2(n^2 - 2^2)}{4!}\sin^4y - \frac{n^2(n^2 - 2^2)(n^2 - 4^2)}{6!}\sin^6y + \cdots. \]

Similarly, Boole noted

\[ \sin(ny) = n\sin y - \frac{n(n^2 - 1^2)}{3!}\sin^3y + \frac{n(n^2 - 1^2)(n^2 - 3^2)}{5!}\sin^5y - \cdots. \]
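
The two series above can be compared numerically with the closed forms $\cos(n\sin^{-1}x)$ and $\sin(n\sin^{-1}x)$. The sketch below is only a sanity check under assumed test values ($n = 2.5$, $x = 0.5$, 60 terms); each term is generated from the previous one by the two-step recurrence that the series coefficients satisfy.

```python
import math

def cos_series(n, x, terms=60):
    """1 - n^2 x^2/2! + n^2 (n^2 - 2^2) x^4/4! - ... (valid for |x| < 1)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # ratio of consecutive terms: -(n^2 - (2k)^2) x^2 / ((2k+1)(2k+2))
        term *= -(n * n - (2 * k) ** 2) * x * x / ((2 * k + 1) * (2 * k + 2))
    return total

def sin_series(n, x, terms=60):
    """n x - n (n^2 - 1^2) x^3/3! + n (n^2 - 1^2)(n^2 - 3^2) x^5/5! - ..."""
    total, term = 0.0, n * x
    for k in range(terms):
        total += term
        # ratio of consecutive terms: -(n^2 - (2k+1)^2) x^2 / ((2k+2)(2k+3))
        term *= -(n * n - (2 * k + 1) ** 2) * x * x / ((2 * k + 2) * (2 * k + 3))
    return total

n, x = 2.5, 0.5                      # assumed test values
y = math.asin(x)
print(cos_series(n, x), math.cos(n * y))
print(sin_series(n, x), math.sin(n * y))
```

For integer $n$ one of the two series terminates: for example, with $n = 2$ the cosine series reduces to $1 - 2x^2$.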


Recall that Newton discovered the series for $\sin(ny)$ (8.16) and communicated it to Leibniz
in his first letter of 1676. The series was afterward employed by Gauss to prove that
$\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin\pi x}$. Boole also used his method to solve some linear differential
equations with variable coefficients and considered even more complex equations 
requiring a somewhat more elaborate technique. A simple example may explain his 
basic method. Boole set out to solve a differential equation with variable coefficients,
commenting that it occurred in the theory of the “Earth’s figure”:

\[ \frac{d^2u}{dx^2} + q^2u - \frac{6u}{x^2} = 0. \tag{21.58} \]


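In his solution, Boole employed the general proposition:³⁵

The equation $u + \phi(D)e^{r\theta}u = U$ will be converted into the form $v + \psi(D)e^{r\theta}v = V$, by the
relations

\[ u = P_r\frac{\phi(D)}{\psi(D)}v, \qquad U = P_r\frac{\phi(D)}{\psi(D)}V, \tag{21.59} \]

wherein $P_r\frac{\phi(D)}{\psi(D)}$ denotes the infinite symbolical product $\frac{\phi(D)\phi(D-r)\phi(D-2r)\cdots}{\psi(D)\psi(D-r)\psi(D-2r)\cdots}$.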


Boole proved this by assuming $u = f(D)v$ and substituting in the first equation
$u + \phi(D)e^{r\theta}u = U$ to get

\[ f(D)v + \phi(D)e^{r\theta}f(D)v = U. \]

By (21.51) this became

\[ f(D)v + \phi(D)f(D - r)e^{r\theta}v = U \]

or

\[ v + \frac{\phi(D)f(D-r)}{f(D)}e^{r\theta}v = (f(D))^{-1}U. \]

35 ibid.

Thus

\[ \psi(D) = \frac{\phi(D)f(D-r)}{f(D)} \]

or

\[ f(D) = \frac{\phi(D)}{\psi(D)}f(D-r) = \frac{\phi(D)\phi(D-r)}{\psi(D)\psi(D-r)}f(D-2r) = \cdots = P_r\frac{\phi(D)}{\psi(D)}. \]


In general, Boole attempted to choose $v$ such that it satisfied the equation
$\frac{d^nv}{dx^n} + q^nv = X$. Boole applied his general proposition to rewrite (21.58) as

\[ u + \frac{q^2}{(D+2)(D-3)}e^{2\theta}u = 0. \]

Now Boole required the equation for $v$ to be $\frac{d^2v}{dx^2} + q^2v = X$, or

\[ v + \frac{q^2}{D(D-1)}e^{2\theta}v = V. \]

Here

\[ \phi(D) = \frac{q^2}{(D+2)(D-3)}, \qquad \psi(D) = \frac{q^2}{D(D-1)}, \]

so that, by the general proposition,

\[ P_2\frac{\phi(D)}{\psi(D)} = \frac{D-1}{D+2}. \]

Thus, $u = \frac{D-1}{D+2}v$ and $0 = \frac{D-1}{D+2}V$. Boole could then take $V = 0$ and $v = c\sin(qx + c_1)$.
Finally,

\[ \begin{aligned}
u = \frac{D-1}{D+2}v &= \left(1 - 3(D+2)^{-1}\right)c\sin(qx + c_1) \\
&= c\left(1 - 3e^{-2\theta}D^{-1}e^{2\theta}\right)\sin(qx + c_1) \\
&= c\left(1 - \frac{3}{x^2}\left(x\frac{d}{dx}\right)^{-1}x^2\right)\sin(qx + c_1) \\
&= c\sin(qx + c_1) - \frac{3c}{x^2}\int dx\,x\sin(qx + c_1) \\
&= c\left(1 - \frac{3}{q^2x^2}\right)\sin(qx + c_1) + \frac{3c}{qx}\cos(qx + c_1).
\end{aligned} \]

In 1860, Boole wrote a book on the calculus of finite differences,³⁶ in which he
discussed operator methods at length. In particular, he employed the operator $e^D$ to


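express the series

\[ \phi(x) - \phi(x+1) + \phi(x+2) - \phi(x+3) + \cdots \]

as an asymptotic series using Bernoulli numbers.³⁷ He observed that

\[ \begin{aligned}
\phi(x) - \phi(x+1) + \phi(x+2) - \phi(x+3) + \cdots
&= (1 - e^D + e^{2D} - e^{3D} + \cdots)\phi(x) \\
&= \frac{1}{e^D + 1}\phi(x).
\end{aligned} \]

He next noted that

\[ \begin{aligned}
\frac{1}{e^D + 1} &= \frac{1}{e^D - 1} - \frac{2}{e^{2D} - 1} \\
&= \frac{1}{D} - \frac{1}{2} + \frac{B_2}{2!}D + \frac{B_4}{4!}D^3 + \cdots - 2\left(\frac{1}{2D} - \frac{1}{2} + \frac{B_2}{2!}2D + \frac{B_4}{4!}2^3D^3 + \cdots\right) \\
&= \frac{1}{2} - \frac{B_2}{2!}(2^2 - 1)D - \frac{B_4}{4!}(2^4 - 1)D^3 - \text{etc.}
\end{aligned} \]

Thus, the formula he proved, now known as Boole’s formula, may be written as

\[ \sum_{k=0}^{\infty}(-1)^k\phi(x+k) = \frac{1}{2}\phi(x) - \sum_{k=1}^{\infty}\frac{B_{2k}(2^{2k} - 1)}{(2k)!}\phi^{(2k-1)}(x). \]

As we have mentioned before, Boole’s formula was originally derived by Euler in
a paper of 1736³⁸ and then presented in several later papers³⁹ and in his differential
calculus book of 1755.⁴⁰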


21.9 Jacobi and the Symbolic Method 


In 1847, Jacobi wrote an interesting paper⁴¹ using the operational method to derive
two results on transformations of series. He applied the second of these to the
derivation of a result in the theory of hypergeometric series known as Pfaff’s
transformation. Jacobi apparently wished to bring attention to Pfaff’s important result.
This wish was finally fulfilled in about 1970, when Richard Askey read Jacobi’s 
paper and made Pfaff’s work known to the mathematical community. Jacobi did not 
explain why he chose to explore the operational method. The work of the British 


36 Boole (2003). 

37 Boole (2003) pp. 101-102. 

38 Eu. I-14 pp. 124-137, especially § 9-11. E55.

39 Boole’s result in almost the same form: Eu. I-14 pp. 124-137, especially § 14. E55. Also E617.
40 Eu. I-10 § 179-183. E 212.

41 Jacobi (1847).

mathematicians may have appealed to his algorithmic style; note that he had visited
Britain in 1842. Both of these transformations had been earlier presented in Euler’s
differential calculus book. Euler’s second formula (10.15) stated that if

\[ f(x) = a + bx + cx^2 + dx^3 + \cdots \]

then

\[ aA_0 + bA_1x + cA_2x^2 + dA_3x^3 + \cdots = A_0f(x) + \Delta A_0\,x\frac{df}{dx} + \frac{\Delta^2A_0}{1\cdot 2}x^2\frac{d^2f}{dx^2} + \frac{\Delta^3A_0}{1\cdot 2\cdot 3}x^3\frac{d^3f}{dx^3} + \cdots. \]

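Jacobi’s proof of this formula was similar to Duncan Gregory’s proof of the
Newton–Montmort formula, discussed earlier. Recall the Arbogast operator $E$ used
by Gregory: $E^kA_0 = A_k$. The formal steps of the argument were then

\[ \begin{aligned}
aA_0 + bA_1x + cA_2x^2 + dA_3x^3 + \cdots
&= (a + bxE + cx^2E^2 + dx^3E^3 + \cdots)A_0 \\
&= f(xE)A_0 = f(x + x(E - 1))A_0 = f(x + x\Delta)A_0 \\
&= \left(f(x) + x\frac{df}{dx}\Delta + \frac{x^2}{1\cdot 2}\frac{d^2f}{dx^2}\Delta^2 + \cdots\right)A_0 \\
&= A_0f(x) + \Delta A_0\,x\frac{df}{dx} + \frac{\Delta^2A_0}{1\cdot 2}x^2\frac{d^2f}{dx^2} + \cdots.
\end{aligned} \]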


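To obtain Pfaff’s transformation, Jacobi specialized the sequence $A_0, A_1, A_2,
A_3, \ldots$ to

\[ 1, \quad \frac{\beta}{\gamma}, \quad \frac{\beta(\beta+1)}{\gamma(\gamma+1)}, \quad \frac{\beta(\beta+1)(\beta+2)}{\gamma(\gamma+1)(\gamma+2)}, \ldots \]

and noted that the first and second differences were

\[ \frac{\beta-\gamma}{\gamma}, \quad \frac{\beta-\gamma}{\gamma}\cdot\frac{\beta}{\gamma+1}, \quad \frac{\beta-\gamma}{\gamma}\cdot\frac{\beta(\beta+1)}{(\gamma+1)(\gamma+2)}, \quad \frac{\beta-\gamma}{\gamma}\cdot\frac{\beta(\beta+1)(\beta+2)}{(\gamma+1)(\gamma+2)(\gamma+3)}, \ldots \]

and

\[ \frac{(\beta-\gamma)(\beta-\gamma-1)}{\gamma(\gamma+1)}, \quad \frac{(\beta-\gamma)(\beta-\gamma-1)}{\gamma(\gamma+1)}\cdot\frac{\beta}{\gamma+2}, \quad \frac{(\beta-\gamma)(\beta-\gamma-1)}{\gamma(\gamma+1)}\cdot\frac{\beta(\beta+1)}{(\gamma+2)(\gamma+3)}, \ldots. \]

In general, he observed,

\[ \Delta^mA_n = \frac{(\beta-\gamma)(\beta-\gamma-1)\cdots(\beta-\gamma-m+1)\cdot\beta(\beta+1)\cdots(\beta+n-1)}{\gamma(\gamma+1)(\gamma+2)\cdots(\gamma+m+n-1)}. \]

In particular, when $n = 0$, he got

\[ \Delta^mA_0 = (-1)^m\frac{(\gamma-\beta)(\gamma-\beta+1)\cdots(\gamma-\beta+m-1)}{\gamma(\gamma+1)\cdots(\gamma+m-1)}. \]

Jacobi then set

\[ f(x) = (1-x)^{-\alpha} = 1 + \alpha x + \frac{\alpha(\alpha+1)}{1\cdot 2}x^2 + \frac{\alpha(\alpha+1)(\alpha+2)}{1\cdot 2\cdot 3}x^3 + \cdots \]

so that Euler’s transformation reduced to Pfaff’s transformation:

\[ \begin{aligned}
1 &+ \frac{\alpha\beta}{1\cdot\gamma}x + \frac{\alpha(\alpha+1)\beta(\beta+1)}{1\cdot 2\cdot\gamma(\gamma+1)}x^2 + \frac{\alpha(\alpha+1)(\alpha+2)\beta(\beta+1)(\beta+2)}{1\cdot 2\cdot 3\cdot\gamma(\gamma+1)(\gamma+2)}x^3 + \cdots \\
&= \frac{1}{(1-x)^{\alpha}}\left(1 - \frac{\alpha(\gamma-\beta)}{1\cdot\gamma}\frac{x}{1-x} + \frac{\alpha(\alpha+1)(\gamma-\beta)(\gamma-\beta+1)}{1\cdot 2\cdot\gamma(\gamma+1)}\frac{x^2}{(1-x)^2} - \cdots\right).
\end{aligned} \]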

21.10 Cartier: Gregory’s Proof of Leibniz’s Rule 




In his 2000 paper “Mathemagics,” Pierre Cartier gives a rigorous version⁴² of
Arbogast and Gregory’s argument for Leibniz’s rule. Cartier makes use of the tensor
product $V \otimes W$ of vector spaces $V$ and $W$. The vector space $V \otimes W$ consists of all
finite sums $\sum\lambda_i(v_i \otimes w_i)$, where $\lambda_i$ are scalars, $v_i \in V$ and $w_i \in W$, and where $v \otimes w$
is bilinear in $v$ and $w$. For the purpose at hand, let $I$ be an interval on the real line
and $C^\infty(I)$ be the vector space of infinitely differentiable functions on $I$. Define the
operators $D_1$ and $D_2$ on $C^\infty(I) \otimes C^\infty(I)$ by

\[ D_1(f \otimes g) = Df \otimes g, \qquad D_2(f \otimes g) = f \otimes Dg. \]

The two operators commute, that is, $D_1D_2 = D_2D_1$. Now define

\[ D(f \otimes g) = Df \otimes g + f \otimes Dg, \]

so that $D = D_1 + D_2$; we can then conclude that

\[ D^n(f \otimes g) = \sum_{k=0}^{n}\binom{n}{k}D^kf \otimes D^{n-k}g. \]

We can convert the tensor product to an ordinary product by observing that $f \cdot g$ is
bilinear in $f$ and $g$ and hence there is a linear map

\[ \mu : C^\infty(I) \otimes C^\infty(I) \to C^\infty(I) \]


42 Cartier (2000). 




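such that $\mu(f \otimes g) = f \cdot g$. The proof can now be completed:

\[ \begin{aligned}
D^n(fg) = D^n(\mu(f \otimes g)) &= \mu(D^n(f \otimes g)) \\
&= \sum_{k=0}^{n}\binom{n}{k}\mu(D^kf \otimes D^{n-k}g) \\
&= \sum_{k=0}^{n}\binom{n}{k}D^kf \cdot D^{n-k}g.
\end{aligned} \]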

Observe that Cartier succeeds in resolving the problem in Gregory’s presentation, that
the operators $D_1$ and $D_2$ do not apply to both $f$ and $g$.
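
The structure of Cartier's argument can be mirrored on polynomials: keep the two factors separate, differentiate each factor on its own, and multiply only at the end (the role of $\mu$). The sketch below is an illustration under that simplification, with polynomials stored as coefficient lists and all function names chosen here for convenience.

```python
from math import comb

def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def deriv(p, times=1):
    """Apply d/dx the given number of times to coefficients [c0, c1, ...]."""
    p = list(p)
    for _ in range(times):
        p = [k * c for k, c in enumerate(p)][1:]
    return p

def mul(p, q):
    """Ordinary product of two polynomials (this plays the role of mu)."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def leibniz_sum(f, g, n):
    """mu applied to sum_k C(n, k) D^k f (x) D^(n-k) g."""
    total = []
    for k in range(n + 1):
        term = mul(deriv(f, k), deriv(g, n - k))
        total = add(total, [comb(n, k) * c for c in term])
    return trim(total)

f = [1, 2, 0, 3]   # 1 + 2x + 3x^3
g = [4, 0, -1]     # 4 - x^2
assert trim(deriv(mul(f, g), 3)) == leibniz_sum(f, g, 3)
```

Here differentiating the first argument of `mul` corresponds to $D_1$ and differentiating the second to $D_2$; the binomial expansion is legitimate because those two actions commute.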


21.11 Hamilton’s Algebra of Complex Numbers and Quaternions 


We have noted a strong algebraic spirit in the work of the British mathematicians 
of 1830-1850 and have observed important modern algebraic concepts in the work 
of Murphy. However, William R. Hamilton’s (1805-1865) algebraic work was 
thoroughly modern in its structure and presentation. In 1826, Hamilton’s friend 
J. T. Graves communicated to him some results on imaginary logarithms. This led 
Hamilton to formulate the theory of algebraic couples as the proper logical foundation 
for complex numbers. He finally presented this to the British Association in 1834.⁴³
Gauss also got these ideas around the same time. Hamilton defined complex numbers 
as a set of pairs of real numbers, called couples, with addition and multiplication 
defined in a special way. More generally, he determined the necessary and sufficient 
conditions for a set of couples to form a commutative and associative division algebra. 
Hamilton first defined the sum and scalar multiplication of couples: 


\[ (b_1, b_2) + (a_1, a_2) = (b_1 + a_1, b_2 + a_2); \]

\[ a \times (a_1, a_2) = (aa_1, aa_2). \]

He took the last equation as the first step toward the definition of the product of two
couples by identifying the real number $a$ with a couple $(a, 0)$ to get

\[ (a, 0) \times (a_1, a_2) = (a, 0)(a_1, a_2) = (a_1, a_2)(a, 0) = (aa_1, aa_2). \]
His aim was to define multiplication in order to satisfy the two conditions 


\[ (b_1 + a_1, b_2 + a_2)(c_1, c_2) = (b_1, b_2)(c_1, c_2) + (a_1, a_2)(c_1, c_2), \tag{21.60} \]

43 Hamilton (1835).

\[ (c_1, c_2)(b_1 + a_1, b_2 + a_2) = (c_1, c_2)(b_1, b_2) + (c_1, c_2)(a_1, a_2). \tag{21.61} \]

Now for this type of multiplication to be possible, he had to have

\[ \begin{aligned}
(c_1, c_2)(a_1, a_2) &= (c_1, 0)(a_1, a_2) + (0, c_2)(a_1, a_2) \\
&= (c_1a_1, c_1a_2) + (0, c_2)(a_1, 0) + (0, c_2)(0, a_2) \\
&= (c_1a_1, c_1a_2) + (0, c_2a_1) + (0, c_2)(0, a_2) \\
&= (c_1a_1, c_1a_2 + c_2a_1) + (0, c_2)(0, a_2).
\end{aligned} \tag{21.62} \]


It remained to define the product $(0, c_2)(0, a_2) = c_2a_2(0, 1)(0, 1)$ contained in the
last step. Hamilton set

\[ (0, 1)(0, 1) = (\gamma_1, \gamma_2) \tag{21.63} \]

and determined the necessary and sufficient condition on $\gamma_1$ and $\gamma_2$ so that the two
conditions (21.60) and (21.61) would hold: He supposed $(b_1, b_2)$ to be the result of the
product on the left-hand side of (21.62); then by equation (21.63) he had

\[ b_1 = c_1a_1 + \gamma_1a_2c_2, \]

\[ b_2 = c_1a_2 + c_2a_1 + \gamma_2a_2c_2. \]

Now to be able to solve these equations for $a_1, a_2$ when $c_1c_2 \neq 0$, the necessary and
sufficient condition was the nonvanishing of the determinant, where the determinant
was given as

\[ c_1(c_1 + \gamma_2c_2) - \gamma_1c_2^2 = \left(c_1 + \frac{1}{2}\gamma_2c_2\right)^2 - \left(\gamma_1 + \frac{1}{4}\gamma_2^2\right)c_2^2. \]


This expression was nonvanishing for all $c_1c_2 \neq 0$ if

\[ \gamma_1 + \frac{1}{4}\gamma_2^2 < 0. \]

The case in which $\gamma_1 = -1$, $\gamma_2 = 0$ gave Hamilton the usual multiplication rule for
complex numbers:

\[ (b_1, b_2)(a_1, a_2) = (b_1, b_2) \times (a_1, a_2) = (b_1a_1 - b_2a_2, b_2a_1 + b_1a_2). \]


Further developing the theory of complex numbers, Hamilton showed that the 
principal square root of (—1,0) was (0, 1), and since (—1,0) could be replaced by —1 
for brevity, he obtained 


\[ \sqrt{-1} = (0, 1). \]

He then wrote⁴⁴

In the THEORY OF SINGLE NUMBERS, the symbol $\sqrt{-1}$ is absurd, and denotes an
IMPOSSIBLE EXTRACTION, or a merely IMAGINARY NUMBER; but in the THEORY OF
COUPLES, the same symbol $\sqrt{-1}$ is significant, and denotes a POSSIBLE EXTRACTION, or
a REAL COUPLE, namely (as we have just now seen) the principal square-root of the couple
$(-1, 0)$. In the latter theory, therefore, though not in the former, this sign $\sqrt{-1}$ may properly be
employed; and we may write, if we choose, for any couple $(a_1, a_2)$ whatever,

\[ (a_1, a_2) = a_1 + a_2\sqrt{-1}, \ldots. \]


Hamilton next attempted to extend his work to triples, or triplets. His motivation 
was to obtain an algebra applicable to three-dimensional geometry and physics. He 
was well aware of the geometrical interpretation of complex numbers as vectors in 
two dimensions. Under this interpretation, the parallelogram law determined addition; 
moreover, the length (or modulus) of the product of two complex numbers turned 
out to be the product of the lengths of the two numbers. In October 1843, Hamilton 
described his train of thought as he worked toward his October 16 discovery of 
quaternions. He explained that he considered triplets of the form $x + iy + jz$
representing points $(x, y, z)$ in space. Here $j$ was “another sort of $\sqrt{-1}$, perpendicular
to the plane itself.”⁴⁵ Addition and subtraction of triplets was a simple matter, but
multiplication turned out to be a challenge. In a letter of 1865 to his son, Hamilton
recalled:⁴⁶


Every morning in the early part of the above-cited month, on my coming down to breakfast, 
your brother William Edwin and yourself used to ask me, ‘Well, Papa, can you multiply triplets?’ 
Whereto I was always obliged to reply, with a sad shake of the head, ‘No, I can only add and 
subtract them.’ 


In his 1843 description of his discovery, Hamilton recounted his dilemma:⁴⁷

In order to multiply triplets, term-by-term multiplication had to be possible and
the modulus of the product was required to equal the product of the moduli. He
observed that

\[ (a + iy + jz)(x + iy + jz) = ax - y^2 - z^2 + i(a + x)y + j(a + x)z + (ij + ji)yz \tag{21.64} \]

and that

\[ (a^2 + y^2 + z^2)(x^2 + y^2 + z^2) = (ax - y^2 - z^2)^2 + (a + x)^2(y^2 + z^2). \]


So the rule for the moduli implied that the last term in (21.64) should be zero, or,
$ij + ji = 0$. He was sufficiently audacious to consider the possibility that $ij = ji = 0$;

44 ibid. 127-128.

45 Hamilton (1945).

46 Graves (1885) pp. 434-435.
47 Hamilton (1945).

in modern terminology, this meant that $i$ and $j$ would be zero divisors. However, when
he examined the general case

\[ (a + ib + jc)(x + iy + jz) = ax - by - cz + i(ay + bx) + j(az + cx) + ij(bz - cy) \]

and the corresponding formula for the moduli

\[ (a^2 + b^2 + c^2)(x^2 + y^2 + z^2) = (ax - by - cz)^2 + (ay + bx)^2 + (az + cx)^2 + (bz - cy)^2, \]

he saw that the coefficient of $ij$, $bz - cy$, could not be dropped and hence $ij$ could not
be zero. Put even more simply, the modulus of the product $ij$ had to be 1 and not 0.

When he reached this result, it dawned on Hamilton that to multiply triplets, he 
must admit in some sense a fourth dimension, and he described this realization in a 
letter to his friend J. T. Graves, written the day after he discovered quaternions. By a 
remarkable coincidence, after completing this letter he came across the May 1843 
issue of the Cambridge Mathematical Journal containing a paper by Cayley on 
analytical geometry of n dimensions. In a postscript to his letter to Graves, Hamilton 
noted that he did not yet know whether or not his ideas were similar to Cayley’s. 

Continuing his description, Hamilton saw that he had to introduce a new imaginary
$k$ such that $ij = k$. Thus, he discovered quaternions! Moreover, $ji = -ij = -k$.
He wondered whether $k^2 = 1$. But this produced the equation

\[ (a + ib + jc + kd)(\alpha + i\beta + j\gamma + k\delta) = a\alpha - b\beta - c\gamma + d\delta + i(a\beta + \cdots) + j(a\gamma + \cdots) + k(a\delta + d\alpha + \cdots), \]

implying that

\[ (a^2 + b^2 + c^2 + d^2)(\alpha^2 + \beta^2 + \gamma^2 + \delta^2) = (a\alpha - b\beta - c\gamma + d\delta)^2 + (a\beta + \cdots)^2 + (a\gamma + \cdots)^2 + (a\delta + d\alpha + \cdots)^2. \]

Of course, this relation could not possibly hold, because the term $2a\alpha d\delta$ in the first
square on the right-hand side would not cancel the same term in the last square. So
Hamilton took $k^2 = -1$. He then supposed that associativity would probably hold true
and hence $-j = (ii)j = i(ij) = ik$. Similarly, $j(ii) = (ji)i = -ki$ or $j = ki$, and
so $ik = -ki$. In this way he obtained the basic relations:

\[ i^2 = j^2 = k^2 = -1, \quad ij = k, \quad jk = i, \quad ki = j, \quad ji = -k, \quad kj = -i, \quad ik = -j. \]


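The product of two quaternions then emerged as

\[ \begin{aligned}
(a + ib + jc + kd)(\alpha + i\beta + j\gamma + k\delta)
&= a\alpha - b\beta - c\gamma - d\delta + i(a\beta + b\alpha + c\delta - d\gamma) \\
&\quad + j(a\gamma - b\delta + c\alpha + d\beta) + k(a\delta + b\gamma - c\beta + d\alpha).
\end{aligned} \]

And of course, with this definition,

\[ \begin{aligned}
(a^2 + b^2 + c^2 + d^2)(\alpha^2 + \beta^2 + \gamma^2 + \delta^2)
&= (a\alpha - b\beta - c\gamma - d\delta)^2 + (a\beta + b\alpha + c\delta - d\gamma)^2 \\
&\quad + (a\gamma - b\delta + c\alpha + d\beta)^2 + (a\delta + b\gamma - c\beta + d\alpha)^2.
\end{aligned} \]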
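
The product rule and the law of moduli are easy to confirm by machine. The following sketch represents a quaternion $a + ib + jc + kd$ as a 4-tuple `(a, b, c, d)`; the names `qmul` and `norm2` are illustrative, and the integer test values are arbitrary.

```python
def qmul(p, q):
    """Hamilton's product of p = (a, b, c, d) and q = (al, be, ga, de)."""
    a, b, c, d = p
    al, be, ga, de = q
    return (a * al - b * be - c * ga - d * de,
            a * be + b * al + c * de - d * ga,
            a * ga - b * de + c * al + d * be,
            a * de + b * ga - c * be + d * al)

def norm2(q):
    """Squared modulus a^2 + b^2 + c^2 + d^2."""
    return sum(t * t for t in q)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert qmul(i, j) == k and qmul(j, i) == (0, 0, 0, -1)    # ij = k, ji = -k
assert qmul(k, k) == (-1, 0, 0, 0)                        # k^2 = -1

p, q, r = (1, 2, 3, 4), (5, 6, 7, 8), (-2, 0, 1, 3)
assert norm2(qmul(p, q)) == norm2(p) * norm2(q)           # law of moduli
assert qmul(qmul(p, q), r) == qmul(p, qmul(q, r))         # associativity
assert qmul(p, q) != qmul(q, p)                           # not commutative
```

Using integers keeps every check exact, so the law of moduli is verified as an identity on these inputs rather than up to rounding.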


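Thus, the modulus of the product equaled the product of the moduli! Interestingly,
Euler also knew this identity in 1748, in connection with the representation of a
number as a sum of four squares.⁴⁸ Hamilton concluded his description of this
discovery:⁴⁹

Hence we may write, on the plan of my theory of couples,

\[ (a, b, c, d)(\alpha, \beta, \gamma, \delta) = (a\alpha - b\beta - c\gamma - d\delta,\; a\beta + b\alpha + c\delta - d\gamma,\; a\gamma - b\delta + c\alpha + d\beta,\; a\delta + b\gamma - c\beta + d\alpha). \]

Hence

\[ (a, b, c, d)^2 = (a^2 - b^2 - c^2 - d^2,\; 2ab,\; 2ac,\; 2ad). \]

Thus

\[ (0, x, y, z)^2 = -(x^2 + y^2 + z^2); \quad (0, x, y, z)^3 = -(x^2 + y^2 + z^2)(0, x, y, z); \quad (0, x, y, z)^4 = +(x^2 + y^2 + z^2)^2; \ \text{\&c.} \]

Therefore

\[ e^{(0,x,y,z)} = 1 + \frac{(0, x, y, z)}{1} + \frac{(0, x, y, z)^2}{1\cdot 2} + \text{\&c.} = \cos\sqrt{x^2 + y^2 + z^2} + \frac{ix + jy + kz}{\sqrt{x^2 + y^2 + z^2}}\sin\sqrt{x^2 + y^2 + z^2} \]

and the modulus of $e^{(0,x,y,z)} = 1$. [Like the modulus of $e^{(0,x)}$ or $e^{\sqrt{-1}x}$] Let

\[ \sqrt{x^2 + y^2 + z^2} = \rho, \quad x = \rho\cos\phi, \quad y = \rho\sin\phi\cos\psi, \quad z = \rho\sin\phi\sin\psi; \]

then

\[ e^{\rho(i\cos\phi + j\sin\phi\cos\psi + k\sin\phi\sin\psi)} = \cos\rho + (i\cos\phi + j\sin\phi\cos\psi + k\sin\phi\sin\psi)\sin\rho; \]

a theorem, which when $\phi = 0$, becomes the well-known equation

\[ e^{i\rho} = \cos\rho + i\sin\rho, \quad i = \sqrt{-1}. \]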
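
Hamilton's exponential formula for a pure quaternion can be checked by summing the power series directly. The sketch below assumes the 4-tuple representation `(a, b, c, d)` for $a + ib + jc + kd$; `qexp` and `closed_form` are illustrative names, and the test point $(x, y, z)$ is an arbitrary choice.

```python
import math

def qmul(p, q):
    """Hamilton's quaternion product on 4-tuples."""
    a, b, c, d = p
    al, be, ga, de = q
    return (a * al - b * be - c * ga - d * de,
            a * be + b * al + c * de - d * ga,
            a * ga - b * de + c * al + d * be,
            a * de + b * ga - c * be + d * al)

def qexp(q, terms=40):
    """Partial sum of 1 + q/1 + q^2/(1*2) + ... computed term by term."""
    result = (1.0, 0.0, 0.0, 0.0)
    power = (1.0, 0.0, 0.0, 0.0)         # holds q^n / n!
    for n in range(1, terms):
        power = tuple(t / n for t in qmul(power, q))
        result = tuple(r + t for r, t in zip(result, power))
    return result

def closed_form(x, y, z):
    """cos(rho) + (ix + jy + kz) sin(rho)/rho with rho = sqrt(x^2 + y^2 + z^2)."""
    rho = math.sqrt(x * x + y * y + z * z)
    s = math.sin(rho) / rho
    return (math.cos(rho), x * s, y * s, z * s)

x, y, z = 0.3, -1.2, 0.5                 # assumed test point
series = qexp((0.0, x, y, z))
exact = closed_form(x, y, z)
print(series)
print(exact)
print(sum(t * t for t in series))        # squared modulus, expected close to 1
```

The closed form requires $\rho \neq 0$; at the origin the exponential is simply 1.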


48 Fuss (1968) vol. 1, p. 452. See also Eu. I-2 pp. 338-372, especially pp. 368-369. E 242 § 93. 
49 Hamilton (1945).

Hamilton’s letter led John T. Graves in December 1843 to produce an eight-dimensional
division algebra, the algebra of octaves or octonions. The law of moduli
was maintained within this system, so that

\[ (a_1^2 + a_2^2 + \cdots + a_8^2)(b_1^2 + b_2^2 + \cdots + b_8^2) = c_1^2 + c_2^2 + \cdots + c_8^2. \]


Hamilton observed that while associativity held for quaternions, it failed to hold for 
octonions. Graves did not publish his work, though Cayley rediscovered and published 
it in 1845. Octonions are therefore called Cayley numbers. 

The German mathematician A. Hurwitz wrote that Hamilton’s two requirements,
that term-by-term multiplication be valid and that the product of the moduli be
equal to the modulus of the product of n-tuples $(x_1, x_2, \ldots, x_n)$, in fact held only
for $n = 1, 2, 4, 8$. This explains why Hamilton was unable to discover a way of
multiplying triplets. In the 1870s, C. S. Peirce and Frobenius gave another explanation 
for Hamilton’s failure to work out a three-dimensional division algebra, i.e., an algebra 
of triplets. They proved that the only real finite-dimensional associative division 
algebras were: the real numbers, the complex numbers, and the quaternions. 

We have seen that Hamilton was initially hesitant to move to the fourth dimension 
and was struck by Cayley’s work outlining a geometry of n dimensions. Note Felix 
Klein’s telling remark on George Green’s 1835 paper concerning the attraction of 
an ellipsoid, “This investigation merits special mathematical interest ... because it is 
carried out for n dimensions, long before the development of n-dimensional geometry 
in Germany began.”⁵⁰ Such was the influence of the formal algebraic approach taken
by British mathematicians of the early 1800s, that even the applied mathematician 
George Green was willing to consider the novel concept of an n-dimensional 
space. 


21.12 Exercises 


(1) Solve the differential equation 


I @ey 
Dida 


1 dty d®y 
Al dx4 = dx 


O=y 


using Euler’s method, by which he obtained 


(2Qn+1)ax (2Qn+1)ax 
v=o (ue 2 + bre 2 \ 


See section 50 of Euler’s paper E 62. 


50 Klein (1979) p. 217.

(2) Prove Faà di Bruno’s formula for the $m$th derivative of a composition of two
functions:

\[ \frac{d^m}{dt^m}g(f(t)) = \sum\frac{m!}{b_1!\,b_2!\cdots b_m!}\,g^{(k)}(f(t))\left(\frac{f'(t)}{1!}\right)^{b_1}\left(\frac{f''(t)}{2!}\right)^{b_2}\cdots\left(\frac{f^{(m)}(t)}{m!}\right)^{b_m}, \]

where the sum is over different solutions of $b_1 + 2b_2 + \cdots + mb_m = m$ and
$k = b_1 + b_2 + \cdots + b_m$. Faà di Bruno gave the right hand side in the form of a
determinant:

\[ \begin{vmatrix}
\binom{m-1}{0}f'g & \binom{m-1}{1}f''g & \binom{m-1}{2}f'''g & \cdots & \binom{m-1}{m-2}f^{(m-1)}g & \binom{m-1}{m-1}f^{(m)}g \\
-1 & \binom{m-2}{0}f'g & \binom{m-2}{1}f''g & \cdots & \binom{m-2}{m-3}f^{(m-2)}g & \binom{m-2}{m-2}f^{(m-1)}g \\
0 & -1 & \binom{m-3}{0}f'g & \cdots & \binom{m-3}{m-4}f^{(m-3)}g & \binom{m-3}{m-3}f^{(m-2)}g \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \binom{1}{0}f'g & \binom{1}{1}f''g \\
0 & 0 & 0 & \cdots & -1 & \binom{0}{0}f'g
\end{vmatrix} \]

where $f^{(i)} = f^{(i)}(t)$ and $g^k = g^{(k)}(f(t))$. Faà di Bruno published this formula
without proof or reference in 1855 and then again in 1857. The formulation
as a determinant appears to be original with Faà di Bruno, who may also be
the only mathematician to be beatified by the Catholic Church. The papers of
Craik (2005) and Johnson (2002) contain a detailed history of Faà di Bruno’s
formula.


(3) Solve the difference equation

\[ u_{x+1}^2 - 4u_x^2(u_x^2 + 1) = 0. \]

Herschel’s hint for the solution is to set $u_x = \sqrt{-1}\sin v_x$. See Herschel (1820)
p. 34.
(4) Solve the difference equation

\[ u_{x+1}u_x - a_x(u_{x+1} - u_x) + 1 = 0. \]

Herschel remarked that this was a slight generalization of the equation worked
out in our text. See Herschel (1820) pp. 36-37.
(5) Sum the series u = > ue 1 a ve +--+ . See Boole (1844b) p. 264. 


(6) Using Boole’s notation given in the text, prove his proposition: The equation

\[ u + \phi(D)e^{r\theta}u = U \]

will be converted to the form

\[ v + \phi(D + n)e^{r\theta}v = V, \]

by the relations $u = e^{n\theta}v$ and $U = e^{n\theta}V$. See Boole (1844b) p. 247.
(7) Let $d_x$ denote the derivative with respect to $x$. Note that $[f(x + y)](d_y - d_x) = 0$.
Apply Murphy’s method for the difference operator to the differential
in order to obtain the Taylor series for $f(x + y)$. See Murphy (1837) p. 196.
(8) Sum the series

\[ \sum_{n=1}^{\infty}\arctan\frac{1}{1 + n + n^2}. \]


See Herschel (1820) p. 57. 


21.13 Notes on the Literature 


Friedelmeyer (1994) gives an extensive discussion of Arbogast and his work. Babbage 
(1961), edited by Morrison and Morrison, contains papers of Babbage, including one 
discussing the Analytical Society. Enros (1983) is a nice discussion of the Analytical 
Society. Becher (1980) is an interesting article on Woodhouse, Babbage, and Peacock. 
For the early nineteenth-century work on operational calculus in Britain, see Allaire 
and Bradley (2002). 

See also articles by E. L. Ortiz and of S. E. Despeaux in Gray and Parshall (2007) 
and the paper of Koppelman (1971) for the role of the operational method in 
the development of abstract algebra. For remarks on the influence of the German 
combinatorial school on Gudermann and Weierstrass, see Manning (1975). 


22


Trigonometric Series after 1830


22.1 Preliminary Remarks 


At the end of his 1829 paper on Fourier series, Dirichlet pointed out that the concept of
the definite integral required further investigation if the theory of Fourier series were
to include functions with an infinite number of discontinuities.¹ In this connection,
he gave the example of a function ¢(x) defined as a fixed constant for rational x and 
another fixed constant for irrational x. Such a function could not be integrated by 
Cauchy’s definition of an integral. Dirichlet stated his plan to publish a paper on this 
topic at the foundation of analysis, but he never presented any results on it, though he 
gave important applications of Fourier series to number theory. 

Bernhard Riemann (1826-1866), a student of Dirichlet, took up this question as he
discussed trigonometric series in his Habilitation paper of 1854.² The first part of the
paper gave a brief history of Fourier series, a topic Riemann studied with Dirichlet’s 
help. In the later portion, Riemann briefly considered a new definition of the integral 
and then went on to study general trigonometric series of the form 


1 (oe) 
5 dag+ SoGn cosnx + by, sinnx), 


n=1 


where the coefficients $a_n$ and $b_n$ were not necessarily defined by the Euler–Fourier 
integrals. Using these series, he could represent nonintegrable functions in terms of 
trigonometric series. Here he introduced methods still not superseded, though they 
have been further developed. He associated with the trigonometric series a continuous 
function F(x) obtained by twice formally integrating the series. Riemann then defined 
the generalized second, or Riemann–Schwarz, derivative of F(x) as 


$$\lim_{h \to 0} \frac{\Delta^2 F(x-h)}{h^2}$$


1 Dirichlet (1969) vol. 1, pp. 117-132, especially p. 132. 
2 Riemann (1990) pp. 259-296. 
and proved that if the trigonometric series converged to some f(x), then the 
Riemann–Schwarz derivative was equal to f(x). In addition, he proved that if $a_n$ and $b_n$ tended 
to zero, then 


$$\lim_{h \to 0} \frac{\Delta^2 F(x-h)}{h} = 0. \tag{22.1}$$


Riemann’s paper also contained a number of very interesting examples, raising 
important questions. Perhaps he did not publish the paper because he was unable to 
answer these questions; Dedekind had it published in 1867 after Riemann’s premature 
death. 

The publication of this paper of Riemann led Heinrich Heine (1821-1881) to ask 
whether more than one trigonometric series could represent the same function. He 
applied Weierstrass’s result that when a series converged uniformly, term-by-term 
integration was possible. Weierstrass taught this theorem in his lectures in Berlin 
starting in the early 1860s, though he had discovered it two decades earlier. From 
this result, Heine concluded that a uniformly convergent trigonometric series was a 
Fourier series; he defined a generally uniformly convergent series to cover the case of 
the series of continuous functions converging to a discontinuous function. Such series 
converged uniformly on the intervals obtained after small neighborhoods around the 
discontinuities had been removed. In a paper of 1870, Heine stated and proved that a 
function could not be represented by more than one generally uniformly convergent 
trigonometric series.³ 

When Georg Cantor (1845-1918) joined Heine at the University of Halle in 1869, 
Heine awakened his interest in this uniqueness question. Cantor had studied at the 
University of Berlin under Kummer, Kronecker, and Weierstrass and wrote his thesis 
on quadratic forms under Kummer. At Heine’s suggestion, Cantor studied Riemann’s 
paper containing the observation, without proof, that if 


$$a_n \cos nx + b_n \sin nx \to 0 \quad \text{as } n \to \infty$$


for all x in an interval, then $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$. Cantor’s first paper on 
trigonometric series, published in 1870,⁴ provided a proof of this important assertion, 
now known as the Cantor-Lebesgue theorem. This was the first step in Cantor’s proof 
of the uniqueness theorem that if two trigonometric series converge to the same 
sum in $(0, 2\pi)$ except for a finite number of points, then the series are identical. 
Note that Henri Lebesgue (1875-1941) later proved the theorem in a more general 
context. 

To prove his theorem, Cantor needed to show that if the generalized second, or 
Riemann—Schwarz, derivative of a continuous function was zero in an interval, then 
the function was linear in that interval. So in a letter of February 17, 1870, he asked 
his friend Hermann Schwarz for a proof of this result. Schwarz had received his 


3 Heine (1870). 
4 Cantor (1870a). 




doctoral degree a few years before Cantor, but they had both studied under Kummer 
and Weierstrass at Berlin. Schwarz left the University of Halle in 1869 and went to 
Zurich, but they corresponded often. In fact, Schwarz wrote to Cantor on February 25, 
1870, “The fact that I wrote to you at length yesterday is no reason why I should not 
write again today.” In this letter Schwarz gave what he said was the first rigorous proof 
of the theorem that if a function had a zero derivative at every value in an interval, then 
the function was a constant in that interval.⁵ 

Schwarz provided a proof of the result Cantor needed for his uniqueness theorem. 
Cantor next studied the case with exceptional points, at which the series was not 
known to converge to zero. Was the value of every coefficient still zero? He supposed c 
to be an exceptional point in an interval (a, b) so that the series converged to zero in the 
intervals (a,c) and (c,b). Now Riemann’s second theorem, given by (22.1), implied 
that the slopes of the two lines had to be the same, and hence F(x) was linear in (a, b), 
and the uniqueness theorem followed. Clearly, the argument could be extended to a 
finite number of exceptional points. When Cantor realized this, he asked whether there 
could be an infinite number of exceptional points; he soon understood that even if the 
exceptional points were infinite in number, as long as they had only a finite number of 
limit points, his basic argument would still be effective. 

Leopold Kronecker (1823-1891) was initially quite interested in the work of Cantor 
on the uniqueness of trigonometric series. After the publication of Cantor’s first 
paper, Kronecker explained to him that the proof of the Cantor-Lebesgue theorem 
could be simplified by means of an idea contained in Riemann’s paper. However, 
as Cantor’s work progressed and he began to use increasingly intricate infinite sets, 
Kronecker lost sympathy with Cantor’s ideas and became a passionate critic of the 
theory of infinite sets. Cantor, on the other hand, abandoned the study of trigonometric 
series and after 1872 became more and more intrigued by infinite sets, at that time 
completely unexplored territory. Luckily, Cantor found an understanding and kindred 
spirit in Dedekind, who had himself done some work on infinite sets. Cantor started a 
correspondence with Dedekind in 1872 that continued off and on for several years.⁶ 
Dedekind helped Cantor write up a concise proof of the countability of the set of 
algebraic numbers, and in 1874 this theorem appeared in Cantor’s first paper on 
infinite sets. 

Though there was some opposition to Cantor’s theory, it was directly and indirectly 
successful as sets became basic objects in the language of mathematics. Without this 
concept, such early twentieth-century innovations as measure theory and the Lebesgue 
integral would hardly have been possible. These advances in turn had consequences 
for the theory of trigonometric series and the theory of uniqueness of such series. As 
an example, consider the noteworthy theorem of W. H. Young from a 1909 paper: 
“If the values of a function be assigned at all but a countable set of points, it can be 
expressed as a trigonometric series in at most one way.”⁷ 


5 For an English translation, see Meschkowski (1964), pp. 87-89. 
6 See Ferreirós (1993) for references. 
7 Young (1909). 
22.2 The Riemann Integral 


In his 1854 paper on trigonometric series,⁸ Riemann observed that since the Euler–Fourier 
coefficients were defined by integrals, he would begin his study of Fourier 
series with a clarification of the concept of an integral. To understand his definition, 
let f be a bounded function defined on an interval (a,b) and let 
$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b$. Denote the length of the subinterval $x_k - x_{k-1}$ by $\delta_k$, 
where $k = 1, 2, \ldots, n$. Let $0 < \epsilon_k < 1$ and set 


$$s = \delta_1 f(a + \epsilon_1 \delta_1) + \delta_2 f(x_1 + \epsilon_2 \delta_2) + \delta_3 f(x_2 + \epsilon_3 \delta_3) + \cdots + \delta_n f(x_{n-1} + \epsilon_n \delta_n).$$


Riemann noted that the value of the sum s depended on $\delta_k$ and $\epsilon_k$, but if it approached 
infinitely close to a fixed limit A as all the $\delta$s became infinitely small, then this limit 
would be denoted by $\int_a^b f(x)\,dx$. On the other hand, if the sum s did not have this 
property, then $\int_a^b f(x)\,dx$ had no meaning. 
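The definition above can be tried out numerically. The following is a minimal sketch, not taken from Riemann's paper: the function name `riemann_sum`, the random choice of division points and of the fractions, and the test function x² on (0,1) are all illustrative assumptions. It forms the sum s for finer and finer divisions and prints the largest subinterval length next to the value of s.

```python
import random

def riemann_sum(f, a, b, n, rng):
    """Tagged sum s = sum of delta_k * f(x_{k-1} + eps_k * delta_k) over a random division."""
    # n - 1 random interior points give a division a = x_0 < x_1 < ... < x_n = b.
    points = sorted([a, b] + [rng.uniform(a, b) for _ in range(n - 1)])
    total = 0.0
    norm = 0.0  # length of the longest subinterval
    for left, right in zip(points, points[1:]):
        delta = right - left
        eps = rng.random()  # a fraction with 0 <= eps < 1 (hypothetical choice of tag)
        total += delta * f(left + eps * delta)
        norm = max(norm, delta)
    return total, norm

rng = random.Random(0)
for n in (10, 100, 1000, 10000):
    s, norm = riemann_sum(lambda x: x * x, 0.0, 1.0, n, rng)
    print(n, round(norm, 5), round(s, 5))
```

For this increasing test function the sum s differs from 1/3 by at most the largest subinterval length, so the printed values settle near 1/3 as the division is refined.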

Riemann then extended the definition of an integral to include unbounded func- 
tions, as Cauchy had done. Thus, if f(x) was infinitely large at a point c in (a,b), 
then 


$$\int_a^b f(x)\,dx = \lim_{\alpha_1 \to 0} \int_a^{c-\alpha_1} f(x)\,dx + \lim_{\alpha_2 \to 0} \int_{c+\alpha_2}^b f(x)\,dx.$$

Riemann next raised the question: When was a function integrable? He gave his 
answer in terms of the variations of the function within subintervals. He let $D_k$ denote 
the difference between the largest and the smallest values of the function in the interval 
$(x_{k-1}, x_k)$ for $k = 1, 2, \ldots, n$. He then argued that if the function was integrable, 
the sum 


$$\delta_1 D_1 + \delta_2 D_2 + \cdots + \delta_n D_n$$


must become infinitely small as the values of $\delta$ became small. He next observed that 
for $\delta_k \le d$ $(k = 1, \ldots, n)$, this sum would have a largest value, $\Delta(d)$. Moreover, 
$\Delta(d)$ decreased with d and $\Delta(d) \to 0$ as $d \to 0$. He noted that if s were the total 
length of those intervals in which the function varied more than some value $\sigma$, then 
the contribution of those intervals to the sum was $\ge \sigma s$. Thus, he arrived at 


$$\sigma s \le \delta_1 D_1 + \delta_2 D_2 + \cdots + \delta_n D_n \le \Delta$$

or 

$$s \le \frac{\Delta}{\sigma}.$$


From this inequality, he concluded that for a given $\sigma$, $\frac{\Delta}{\sigma}$ could be made arbitrarily 
small by a suitable choice of d and hence the same was true for s. Riemann could 
then state that a bounded function f(x) was integrable only if the total length of the 


8 Riemann (1990) pp. 259-296. 
intervals in which the variations of f(x) were $> \sigma$ could be made arbitrarily small by 
a suitable choice of d. He also gave a short argument proving the converse. Riemann’s 
proof omitted some details necessary to make it completely convincing. In fact, in an 
1875 paper presented to the London Mathematical Society, H. J. S. Smith formulated 
a clearer definition of integrability and a modified form of Riemann’s theorem.⁹ 
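Riemann's inequality σs ≤ δ₁D₁ + ⋯ + δₙDₙ can be checked on a simple example. The sketch below is only an illustration under stated assumptions: the names `oscillation_data` and `step` are hypothetical, the division is uniform, and the test function is nondecreasing with a single jump at 1/2, so that the variation on each closed subinterval is the difference of the endpoint values.

```python
def oscillation_data(f, a, b, n, sigma):
    """For a nondecreasing f and a uniform division, return (sum of delta_k * D_k, s)."""
    delta = (b - a) / n
    osc_sum = 0.0
    s = 0.0  # total length of the subintervals on which f varies by more than sigma
    for k in range(1, n + 1):
        left = a + (k - 1) * delta
        right = a + k * delta
        # For nondecreasing f the largest and smallest values on the closed
        # subinterval are attained at its endpoints.
        d_k = f(right) - f(left)
        osc_sum += delta * d_k
        if d_k > sigma:
            s += delta
    return osc_sum, s

def step(x):
    """Illustrative bounded function on (0, 1) with one jump of height 1 at x = 1/2."""
    return x if x < 0.5 else x + 1.0

sigma = 0.5
for n in (10, 100, 1000):
    osc_sum, s = oscillation_data(step, 0.0, 1.0, n, sigma)
    print(n, osc_sum, s, sigma * s <= osc_sum)
```

Here both the sum and the length s shrink like 1/n, in line with the integrability condition for a function with a single discontinuity.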

Riemann gave a number of interesting examples of applications of this theorem, 
remarking that they were quite novel. For instance, he considered the function defined 
by the series 


$$f(x) = \frac{(x)}{1} + \frac{(2x)}{4} + \frac{(3x)}{9} + \cdots = \sum_{n=1}^{\infty} \frac{(nx)}{n^2}, \tag{22.2}$$


where (x) was the difference between x and the closest integer; in the ambiguous case 
when x was at the midpoint between two successive integers, (x) was taken to be zero. 
He showed that for $x = \frac{p}{2n}$, where p and n were relatively prime, 


$$f(x+0) = f(x) - \frac{1}{2nn}\left(1 + \frac{1}{9} + \frac{1}{25} + \cdots\right) = f(x) - \frac{\pi\pi}{16nn},$$

$$f(x-0) = f(x) + \frac{1}{2nn}\left(1 + \frac{1}{9} + \frac{1}{25} + \cdots\right) = f(x) + \frac{\pi\pi}{16nn};$$


at all other values of x, f(x) was continuous. Riemann applied his theorem to show 
that, although (22.2) had an infinite number of discontinuities, it was integrable over (0,1). 
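The size of the jump can be checked numerically at x = 1/2 (the case p = 1, n = 1), where the two one-sided formulas give f(x − 0) − f(x + 0) = ππ/8. The following is a rough sketch: the names `signed_distance` and `riemann_f`, the truncation at 2000 terms, and the offset 10⁻⁷ are assumptions made for illustration, so the agreement is only approximate.

```python
import math

def signed_distance(x):
    """(x): x minus the nearest integer, taken to be 0 at midpoints between integers."""
    frac = x - math.floor(x)
    if frac == 0.5:
        return 0.0
    return x - math.floor(x + 0.5)

def riemann_f(x, terms=2000):
    """Partial sum of (x)/1 + (2x)/4 + (3x)/9 + ... with the given number of terms."""
    return sum(signed_distance(k * x) / (k * k) for k in range(1, terms + 1))

eps = 1e-7  # small offset used to approximate the one-sided limits
jump = riemann_f(0.5 - eps) - riemann_f(0.5 + eps)
print(jump, math.pi ** 2 / 8)
```

The two printed numbers agree to about three decimal places; the remaining gap comes from the terms beyond the truncation point.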


22.3 Smith: Revision of Riemann and Discovery of the Cantor Set 


Henry Smith did his most notable work in number theory and elliptic functions, 
but his 1875 paper “On the Integration of Discontinuous Functions” also obtained 
some important results later found by Cantor. Though continental mathematicians did 
not notice this paper, it anticipated by eight years Cantor’s construction of a ternary 
set.¹⁰ In order to reformulate Riemann’s definition and theorem on integrability, Smith 
efficiently set up the modern definition of the Riemann integral in terms of the upper 
and lower Riemann sums, politely pointing out the gap in Riemann’s work:¹¹ 


Riemann, in his Memoir... , has given an important theorem which serves to determine whether 
a function f(x) which is discontinuous, but not infinite, between the finite limits a and b, does or 
does not admit of integration between those limits, the variable x, as well as the limits a and b, 
being supposed real. Some further discussion of this theorem would seem to be desirable, partly 
because, in one particular at least, Riemann’s demonstration is wanting in formal accuracy, and 
partly because the theorem itself appears to have been misunderstood, and to have been made the 
basis of erroneous inferences. 


9 Smith (1875). 
10 Smith (1965a) vol. 2, pp. 94-95. 
11 ibid. pp. 86-89. 
Let d be any given positive quantity, and let the interval b − a be divided into any segments 
whatever, $\delta_1 = x_1 - a$, $\delta_2 = x_2 - x_1$, ..., $\delta_n = b - x_{n-1}$, subject only to the condition that none 
of these segments surpasses d. We may term d the norm of the division; it is evident that there 
is an infinite number of different divisions having a given norm; and that a division appertaining 
to any given norm, appertains also to every greater norm. Let $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ be positive proper 
fractions; if, when the norm d is diminished indefinitely, the sum 


$$S = \delta_1 f(a + \epsilon_1 \delta_1) + \delta_2 f(x_1 + \epsilon_2 \delta_2) + \cdots + \delta_n f(x_{n-1} + \epsilon_n \delta_n)$$


converges to a definite limit, whatever be the mode of division, and whatever be the fractions 
$\epsilon_1, \epsilon_2, \ldots, \epsilon_n$, that limit is represented by the symbol $\int_a^b f(x)\,dx$, and the function f(x) is said to 
admit of integration between the limits a and b. We shall call the values of f(x) corresponding to 
the points of any segment the ordinates of that segment; by the ordinate difference of a segment 
we shall understand the difference between the greatest and least ordinates of the segment. For any 
given division $\delta_1, \delta_2, \ldots, \delta_n$, the greatest value of S is obtained by taking the maximum ordinate 
of each segment, and the least value of S by taking the minimum ordinate of each segment; if $D_i$ 
is the ordinate difference of the segment $\delta_i$, the difference $\theta$ between those two values of S is 


$$\theta = \delta_1 D_1 + \delta_2 D_2 + \cdots + \delta_n D_n.$$


But, for a given norm d, the greatest value of S, and the least value of S, will in general result, 
not from one and the same division, but from two different divisions, each of them having the 
given norm. Hence the difference $\Theta$ between the greatest and least values that S can acquire for 
a given norm, is, in general, greater than the greatest of the differences $\theta$. To satisfy ourselves, in 
any given case, that S converges to a definite limit, when d is diminished without limit, we must 
be sure that $\Theta$ diminishes without limit; and it is not enough to show (as the form of Riemann’s 
proof would seem to imply) that $\theta$ diminishes without limit, even if this should be shown for 
every division having the norm d. 


With this revised definition of the integral, Smith was in a position to restate Riemann’s 
condition for integrability: 


Let $\sigma$ be any given quantity, however small; if, in every division of norm d, the sum of the 
segments, of which the ordinate differences surpass $\sigma$, diminishes without limit, as d diminishes 
without limit, the function admits of integration; and, vice versa, if the function admits of 
integration, the sum of these segments diminishes without limit with d. 


Recall that Cantor was led to his theory of infinite sets through his researches in 
trigonometric series, and these in turn had their origins in Riemann’s paper. This 
paper also inspired mathematicians to investigate the possibility of other peculiar or 
pathological functions and to construct infinite sets with apparently strange properties. 
In 1870 Hermann Hankel, a student of Riemann, constructed infinite nowhere-dense 
sets and he gave a flawed proof that a function with discontinuities only on a nowhere 
dense set was integrable. However, Hankel succeeded in proving that the set of points 
of continuity of an integrable function was dense. 

Smith was the first to notice the mistake in Hankel’s proof; to begin to tackle this 
problem, he divided the interval (0,1) into m > 2 equal parts where the last segment 
was not further divided. The remaining m − 1 segments were again divided into m 
equal parts with the last segments of each left undivided. This process was continued 
ad infinitum to obtain the set P of division points. Smith proved that P was nowhere 
dense; he called them points “in loose order.” The union of the set P and its limit points 
is now called a Cantor set since in 1883 Cantor constructed such a set with m = 3. 
Smith showed that after k steps, the total length of the divided segments was $\left(1 - \frac{1}{m}\right)^k$, 
so that as k increased indefinitely, the points of P were located on segments occupying 
only an infinitesimal portion of the interval (0, 1). He then applied Riemann’s criterion 
for integrability to show that bounded functions whose discontinuities occur only at 
points in P would be integrable. 

With a slight modification of this construction, Smith showed that there existed 
nowhere dense sets of positive measure. The first step in his modification was the 
same as before. In the second step he divided the m − 1 divided segments into $m^2$ 
parts, but did not further divide the last segment of each of these. The $(m-1)(m^2-1)$ 
remaining segments were divided into $m^3$ parts, and so on. After k steps, Smith found 
the total length of the divided segments to be $\left(1 - \frac{1}{m}\right)\left(1 - \frac{1}{m^2}\right) \cdots \left(1 - \frac{1}{m^k}\right)$. He 
noted that the limit $\prod_{k=1}^{\infty} \left(1 - \frac{1}{m^k}\right)$ was not equal to zero. He again proved that 
the set of division points Q was nowhere dense but that in this case a function with 
discontinuities at the points in Q was not integrable. Smith then noted,¹² “The result 
obtained in the last example deserves attention, because it is opposed to a theory of 
discontinuous functions, which has received the sanction of an eminent geometer, 
Dr. Hermann Hankel, whose recent death at an early age is a great loss to mathematical 
science.” 
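The contrast between Smith's two constructions shows up clearly in the numbers. The sketch below simply evaluates the two length formulas quoted above; the function names are hypothetical and the choice m = 3 is only for illustration.

```python
def first_construction_length(m, k):
    """Total length of the divided segments after k steps: (1 - 1/m)**k."""
    return (1.0 - 1.0 / m) ** k

def second_construction_length(m, k):
    """Product (1 - 1/m)(1 - 1/m**2)...(1 - 1/m**k) from the modified construction."""
    length = 1.0
    for j in range(1, k + 1):
        length *= 1.0 - 1.0 / m ** j
    return length

for k in (1, 5, 20, 80):
    print(k, first_construction_length(3, k), second_construction_length(3, k))
```

For m = 3 the first column of lengths tends to zero, while the second stabilizes near 0.56, so the division points of the second construction cannot be enclosed in segments of arbitrarily small total length.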

In his thesis of 1902,¹³ Lebesgue proved that a bounded function was Riemann 
integrable if and only if the set of its discontinuities was of measure zero.¹⁴ Smith 
would perhaps not have been surprised at this result. 


22.4 Riemann’s Theorems on Trigonometric Series 


In his 1854 paper, after defining the integral, Riemann also investigated the question 
of whether a function could be represented by a trigonometric series without assuming 
any specific properties of the function, such as whether the function was integrable. Of 
course, if a function is not integrable it cannot have a Fourier series. Thus, Riemann 
focused on series of the form 


$$\Omega = \frac{1}{2} a_0 + (a_1 \cos x + b_1 \sin x) + (a_2 \cos 2x + b_2 \sin 2x) + \cdots = A_0 + A_1 + A_2 + \cdots,$$


where $A_0 = \frac{a_0}{2}$ and for n > 0 


$$A_n = a_n \cos nx + b_n \sin nx.$$


12 ibid. p. 95. 
13 Lebesgue (1902). 
14 See Hawkins (1975) p. 127. 
He assumed that $A_n \to 0$ as $n \to \infty$ and he associated with $\Omega$ a function F(x) 
obtained by twice formally integrating the series $\Omega$. Thus, he set 


$$C + C'x + A_0 \frac{xx}{2} - A_1 - \frac{A_2}{4} - \frac{A_3}{9} - \cdots = F(x) \tag{22.3}$$


and proved F(x) continuous by showing that the series was uniformly convergent, 
though he did not use this terminology. He then stated his first theorem on F(x): 


If the series $\Omega$ converges, 


$$\frac{F(x+\alpha+\beta) - F(x+\alpha-\beta) - F(x-\alpha+\beta) + F(x-\alpha-\beta)}{4\alpha\beta} \tag{22.4}$$


converges to the same value as the series if $\alpha$ and $\beta$ become infinitely small in such a way that 
their ratio remains finite (bounded). 


By using the addition formula for sine and cosine, Riemann saw that expression (22.4) 
reduced to 


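$$A_0 + A_1 \frac{\sin\alpha}{\alpha}\frac{\sin\beta}{\beta} + A_2 \frac{\sin 2\alpha}{2\alpha}\frac{\sin 2\beta}{2\beta} + A_3 \frac{\sin 3\alpha}{3\alpha}\frac{\sin 3\beta}{3\beta} + \cdots.$$

When $\alpha = \beta$, he had the equation 

$$\frac{F(x+2\alpha) - 2F(x) + F(x-2\alpha)}{4\alpha\alpha} = A_0 + A_1\left(\frac{\sin\alpha}{\alpha}\right)^2 + A_2\left(\frac{\sin 2\alpha}{2\alpha}\right)^2 + \cdots. \tag{22.5}$$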


Riemann first proved this theorem for the $\alpha = \beta$ case and then deduced the general 
case. Observe that as $\alpha \to 0$, the series (22.5) converges termwise to $\Omega$. Thus, 
Riemann’s task was essentially to show that (22.5) converged uniformly with respect 
to $\alpha$. We follow Riemann in detail, keeping in mind that Riemann did not use absolute 
values as we would today. Suppose that the series $\Omega$ converges to a function f(x). Write 

$$A_0 + A_1 + \cdots + A_{n-1} = f(x) + \epsilon_n \tag{22.6}$$

so that 

$$A_0 = f(x) + \epsilon_1, \quad \text{and} \quad A_n = \epsilon_{n+1} - \epsilon_n. \tag{22.7}$$

Riemann noted that, because of convergence, for any positive number $\delta$, there existed 
an integer m such that $\epsilon_n < \delta$ for n > m. By (22.7) and using summation by parts, he 
concluded that 

$$\sum_{n=0}^{\infty} A_n \left(\frac{\sin n\alpha}{n\alpha}\right)^2 = f(x) + \sum_{n=1}^{\infty} \epsilon_n \left\{ \left(\frac{\sin (n-1)\alpha}{(n-1)\alpha}\right)^2 - \left(\frac{\sin n\alpha}{n\alpha}\right)^2 \right\}. \tag{22.8}$$
He then took $\alpha$ to be sufficiently small, so that $m\alpha < \pi$, and let s be the largest integer 
in $\frac{\pi}{\alpha}$. He divided the last sum into three parts: 

$$\sum_{n=1}^{m} + \sum_{n=m+1}^{s} + \sum_{n=s+1}^{\infty}.$$

The first sum was a finite sum of continuous functions, and it could be made arbitrarily 
small by taking $\alpha$ sufficiently small. In the second sum, the factor multiplying $\epsilon_n$ was 
positive and hence the sum was 

$$< \delta \left\{ \left(\frac{\sin m\alpha}{m\alpha}\right)^2 - \left(\frac{\sin s\alpha}{s\alpha}\right)^2 \right\}.$$

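Note that Riemann assumed $0 < n\alpha < \pi$ for any n in the second sum, although he 
did not explicitly mention this. To show that the third sum could be made arbitrarily 
small, he rewrote the general term as the sum of 

$$\epsilon_n \left\{ \left(\frac{\sin (n-1)\alpha}{(n-1)\alpha}\right)^2 - \left(\frac{\sin (n-1)\alpha}{n\alpha}\right)^2 \right\}$$

and 

$$\epsilon_n \left\{ \left(\frac{\sin (n-1)\alpha}{n\alpha}\right)^2 - \left(\frac{\sin n\alpha}{n\alpha}\right)^2 \right\} = -\epsilon_n \frac{\sin (2n-1)\alpha \, \sin\alpha}{(n\alpha)^2}.$$

It was then clear that the general term in the third sum was less than 

$$\delta \left\{ \frac{1}{(n-1)^2 \alpha\alpha} - \frac{1}{nn\alpha\alpha} \right\} + \delta \frac{1}{nn\alpha}.$$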


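Thus, the third sum was less than 

$$\delta \left\{ \frac{1}{(s\alpha)^2} + \frac{1}{s\alpha} \right\}.$$

Then, for infinitely small $\alpha$, this expression became 

$$\delta \left\{ \frac{1}{\pi\pi} + \frac{1}{\pi} \right\}.$$

Riemann concluded that the infinite series on the right-hand side of (22.8) could not 
be greater than 

$$\delta \left\{ 1 + \frac{1}{\pi\pi} + \frac{1}{\pi} \right\},$$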


so that the theorem was proved. Riemann’s argument can be shortened by observing 
that the second and third sums, in absolute value, are together less than 

$$\delta \sum_{n=m+1}^{\infty} \int_{(n-1)\alpha}^{n\alpha} \left| \frac{d}{dt} \left(\frac{\sin t}{t}\right)^2 \right| dt < \delta \int_0^{\infty} \left| \frac{d}{dt} \left(\frac{\sin t}{t}\right)^2 \right| dt.$$


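Since the last integral is convergent, the result follows. To prove the general result, 
when $\alpha \ne \beta$, Riemann set 

$$F(x+\alpha+\beta) - 2F(x) + F(x-\alpha-\beta) = (\alpha+\beta)^2 (f(x) + \delta_1),$$

$$F(x+\alpha-\beta) - 2F(x) + F(x-\alpha+\beta) = (\alpha-\beta)^2 (f(x) + \delta_2),$$

so that 

$$\frac{F(x+\alpha+\beta) - F(x+\alpha-\beta) - F(x-\alpha+\beta) + F(x-\alpha-\beta)}{4\alpha\beta} = f(x) + \frac{(\alpha+\beta)^2}{4\alpha\beta}\,\delta_1 - \frac{(\alpha-\beta)^2}{4\alpha\beta}\,\delta_2.$$

The special case $\alpha = \beta$ implied that $\delta_1$ and $\delta_2$ became small as $\alpha$ and $\beta$ got small. 
Moreover, the factors $\frac{(\alpha+\beta)^2}{4\alpha\beta}$ and $\frac{(\alpha-\beta)^2}{4\alpha\beta}$ remained bounded when the ratio $\frac{\beta}{\alpha}$ and its reciprocal were bounded. 
This proved the general case. Observe that the limit 

$$\lim_{h \to 0} \frac{F(x+h) + F(x-h) - 2F(x)}{h^2}$$

is called the Schwarz, or Riemann–Schwarz, derivative of F. Riemann called this the 
“second differential quotient.” Note that 

$$F(x+h) + F(x-h) - 2F(x) = \Delta^2 F(x-h),$$

where $\Delta F(x-h) = F(x) - F(x-h)$. 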


In general, F(x) is continuous, as Riemann proved, but not necessarily differ- 
entiable. So here we have an instance of a generalized second derivative, although 
Riemann did not express himself in those terms. 
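Riemann's first theorem can be illustrated numerically with a familiar series. The sketch below is an assumption-laden example rather than anything in Riemann's paper: it takes $A_0 = 0$ and $A_n = \frac{\sin nx}{n}$, a series known to converge to $\frac{\pi - x}{2}$ for $0 < x < 2\pi$, and evaluates a truncation of the right-hand side of (22.5); the function name and the number of terms are hypothetical choices.

```python
import math

def second_difference_quotient(x, alpha, terms=20000):
    """Truncated series (22.5) with A_0 = 0 and A_n = sin(n x)/n."""
    total = 0.0
    for n in range(1, terms + 1):
        a_n = math.sin(n * x) / n
        factor = (math.sin(n * alpha) / (n * alpha)) ** 2
        total += a_n * factor
    return total

x = 1.0
for alpha in (0.2, 0.1, 0.05, 0.01):
    print(alpha, second_difference_quotient(x, alpha), (math.pi - x) / 2)
```

In this example the twice-integrated function F is a cubic polynomial on $(0, 2\pi)$, so the quotient already equals $\frac{\pi - x}{2}$ for each small $\alpha$, up to truncation error; for a general convergent series one only expects agreement in the limit $\alpha \to 0$.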

Riemann’s second theorem stated that when $A_n \to 0$ as $n \to \infty$, then 

$$\frac{F(x+2\alpha) + F(x-2\alpha) - 2F(x)}{2\alpha}$$

tends to 0 as $\alpha$ tends to 0. In his terse style, Riemann gave a succinct argument for 
this, along lines similar to his proof of his first theorem. In his The Apprenticeship 
of a Mathematician, André Weil wrote that both he and his sister Simone found 
great value in the works of great minds and that he was very lucky to start off 
his mathematical reading of the greats with Riemann; he found that Riemann’s 
works “are not hard to read, as long as one realizes that every word is loaded with 
meaning; there is perhaps no other mathematician whose writing matches Riemann’s 
for density.”¹⁵ 


15 Weil (1992) p. 40. 
22.5 The Riemann–Lebesgue Lemma 


The Riemann–Lebesgue lemma states that if f(x) is integrable over (a,b), then as 
$t \to \infty$, 

$$\int_a^b f(x) \cos tx \, dx \to 0, \quad \text{and} \quad \int_a^b f(x) \sin tx \, dx \to 0.$$
a a 


Note that this result implies that the nth Fourier coefficients of an integrable function 
tend to zero as $n \to \infty$. Riemann derived his lemma from his integrability condition 
in an interesting way. Again in his 1854 paper, he began by writing 

$$\int_0^{2\pi} f(x) \sin nx \, dx = \sum_{k=1}^{n} \int_{\frac{2(k-1)\pi}{n}}^{\frac{2k\pi}{n}} f(x) \sin nx \, dx.$$


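He noted that sin nx was positive in the first half of the subinterval $\left(\frac{2(k-1)\pi}{n}, \frac{2k\pi}{n}\right)$ 
and negative in the second half. He supposed that in the whole subinterval he had 
$m_k \le f(x) \le M_k$, where $M_k$ was taken to be the largest value of f(x) in the 
subinterval and $m_k$ the least. We may assume these to be the least upper bound and 
greatest lower bound, respectively. Thus, in the first half of the subinterval, 

$$\int_{\frac{2(k-1)\pi}{n}}^{\frac{(2k-1)\pi}{n}} f(x) \sin nx \, dx \le M_k \int_{\frac{2(k-1)\pi}{n}}^{\frac{(2k-1)\pi}{n}} \sin nx \, dx = \frac{2M_k}{n}.$$

Similarly, in the second half of the subinterval, the integral would be less than $-\frac{2m_k}{n}$. 
It followed that 

$$\int_{\frac{2(k-1)\pi}{n}}^{\frac{2k\pi}{n}} f(x) \sin nx \, dx \le \frac{2}{n}(M_k - m_k),$$

with the corresponding lower bound $-\frac{2}{n}(M_k - m_k)$ obtained by exchanging the roles of $M_k$ and $m_k$, 
and hence 

$$\left| \int_0^{2\pi} f(x) \sin nx \, dx \right| \le \sum_{k=1}^{n} \frac{2}{n}(M_k - m_k) = \frac{1}{\pi} \sum_{k=1}^{n} \delta_k D_k,$$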


where $\delta_k$ was the length of the kth subinterval and $D_k$ was the variation of f(x) on 
that interval. By his own definition of integrability of f(x), the sum $\sum \delta_k D_k$ had to 
become infinitely small as n became infinitely large. This proved the theorem. Observe 
that the definition of integrability was perfect for obtaining this result on the Fourier 
coefficients, leading some to speculate that Riemann fashioned the definition with this 
result in mind. 
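Riemann's bound can be checked numerically for a simple integrable function. The following sketch makes several assumptions for illustration: the test function f(x) = x, the midpoint rule used to approximate the integral, the sampling used to estimate the variations $D_k$, and the function names are all hypothetical choices, not part of the original argument.

```python
import math

def sine_coefficient_integral(f, n, samples_per_subinterval=200):
    """Midpoint-rule approximation of the integral of f(x) sin(nx) over (0, 2*pi)."""
    steps = n * samples_per_subinterval
    h = 2 * math.pi / steps
    return sum(f((j + 0.5) * h) * math.sin(n * (j + 0.5) * h) for j in range(steps)) * h

def riemann_bound(f, n, samples_per_subinterval=200):
    """(1/pi) * sum of delta_k * D_k over the n subintervals of length 2*pi/n."""
    delta = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        # D_k is estimated by sampling f on the kth subinterval.
        values = [f(k * delta + delta * j / samples_per_subinterval)
                  for j in range(samples_per_subinterval + 1)]
        total += delta * (max(values) - min(values))
    return total / math.pi

f = lambda x: x  # illustrative integrable function on (0, 2*pi)
for n in (1, 2, 5, 10, 50):
    print(n, sine_coefficient_integral(f, n), riemann_bound(f, n))
```

For this choice the integral equals $-\frac{2\pi}{n}$ and the bound equals $\frac{4\pi}{n}$, so both columns shrink like 1/n and the inequality holds with room to spare.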


22.6 Schwarz’s Lemma on Generalized Derivatives 


Recall that in connection with his work on trigonometric series, Cantor in 1870 asked 
Schwarz whether the following result was true: If F(x) is continuous in an interval 
$a < x < b$ and 

$$\lim_{\alpha \to 0} \frac{F(x+\alpha) - 2F(x) + F(x-\alpha)}{\alpha\alpha} = 0 \tag{22.9}$$


for all x in the interval, then F(x) is a linear function. Schwarz replied to this letter 
that this was indeed a theorem and provided a proof, republished twenty years later 
in his collected mathematical works.¹⁶ Cantor used the theorem and gave the proof, 
credited to Schwarz, in his 1870 paper. 

Now note that if F is twice differentiable, then its second derivative and generalized 
second derivative are identical; moreover, by (22.9), F(x) is linear. Briefly, Schwarz’s 
proof of the general case began by setting 


$$\varphi(x) = \left| F(x) - F(a) - \frac{x-a}{b-a}\,(F(b) - F(a)) \right| - \frac{k}{2}(x-a)(b-x), \tag{22.10}$$

where k was a positive quantity to be chosen later. Schwarz did not employ the 
absolute value sign, instead using an $\epsilon$, equal to plus or minus 1, as a factor to maintain 
a positive value. Observe that $\varphi(a) = 0$ and $\varphi(b) = 0$. If the expression inside the 
absolute value sign in (22.10) is zero for all x in a < x < b, then F(x) is a linear 
function. Suppose the value of the expression is not zero. Since $\varphi(x)$ is continuous, it 
has a maximum at some point $x_0$. Take k sufficiently small that the value of $\varphi(x_0)$ is 
positive. By the definition of maximum, 


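$$\varphi(x_0 + \alpha) - \varphi(x_0) \le 0 \quad \text{and} \quad \varphi(x_0 - \alpha) - \varphi(x_0) \le 0;$$

thus, 

$$\varphi(x_0 + \alpha) - 2\varphi(x_0) + \varphi(x_0 - \alpha) \le 0.$$

But 

$$\lim_{\alpha \to 0} \frac{\varphi(x_0 + \alpha) - 2\varphi(x_0) + \varphi(x_0 - \alpha)}{\alpha\alpha} = \lim_{\alpha \to 0} \left( \epsilon\, \frac{F(x_0 + \alpha) - 2F(x_0) + F(x_0 - \alpha)}{\alpha\alpha} + k \right) = k > 0.$$

Here $\epsilon = \pm 1$ is the sign of the expression inside the absolute value sign in (22.10) at $x_0$; by continuity, 
this sign is constant near $x_0$. 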


This contradiction implies that F(x) is a linear function. Note that Weierstrass is 
credited with the 1841 invention of the absolute value sign we use today. 


22.7 Cantor’s Uniqueness Theorem 


Cantor first stated his uniqueness theorem in 1870, though he later gave generalizations.¹⁷ 
His first theorem stated that if a trigonometric series 


$$\frac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$$


16 Schwarz (1972) vol. 2, pp. 341-343. 
17 Cantor (1870a). 
converged to zero at every point of the interval $(-\pi, \pi)$, then $a_0 = 0$ and $a_n = b_n = 0$ 
for $n \ge 1$. To prove this, Cantor first used the convergence of the trigonometric 
series to produce a tedious proof that $a_n \to 0$ and $b_n \to 0$ as $n \to \infty$. Later 
on, Kronecker helped Cantor realize that he could greatly streamline his proof by 
working with a different series. But we continue with Cantor’s original proof based on 
this result. Observe that he could apply Riemann’s second theorem so that the second 
Riemann—Schwarz derivative of 


$$F(x) = \frac{1}{4} a_0 x^2 - \sum_{n=1}^{\infty} \frac{1}{n^2} (a_n \cos nx + b_n \sin nx)$$


was zero in $(-\pi, \pi)$. By Schwarz’s lemma, F(x) was a linear function ax + b, and he 
had 


$$\sum_{n=1}^{\infty} \frac{1}{n^2} (a_n \cos nx + b_n \sin nx) = \frac{1}{4} a_0 x^2 - ax - b.$$


Since the left-hand side was periodic, $a_0$ and a had to be zero. Because the series was 
uniformly convergent, Cantor could multiply by cos mx and sin mx and integrate term 
by term to obtain 


$$\frac{\pi a_m}{m^2} = -b \int_{-\pi}^{\pi} \cos mx \, dx = 0, \qquad \frac{\pi b_m}{m^2} = -b \int_{-\pi}^{\pi} \sin mx \, dx = 0,$$

for $m \ge 1$. This concludes Cantor’s original proof. Observe that as a student of 
Weierstrass, he was quite familiar with uniform convergence and its connection with 
integration, but at that time the concept of uniform convergence was not well known. 
To take care of the first step concerning $a_n$ and $b_n$, Kronecker pointed out that it 
was not necessary to prove that these coefficients tended to zero. Instead, he called the 
trigonometric series in the theorem f(x) and defined a new function in terms of u: 


$$g(u) = \frac{1}{2} (f(x+u) + f(x-u)) = \frac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) \cos nu = \frac{1}{2} a_0 + \sum_{n=1}^{\infty} A_n \cos nu.$$

Since the series f(x) converged, g(u) also converged and therefore 
$A_n = a_n \cos nx + b_n \sin nx \to 0$ as $n \to \infty$ for all x in $(-\pi, \pi)$. With this new first 
step, Riemann’s second theorem could then be applied, using g(u) instead of f(x), 
yielding $A_n = 0$ for $n \ge 1$. Thus, $a_n = 0$ and $b_n = 0$ for $n \ge 1$, so he also had 
$a_0 = 0$. Though Kronecker assisted Cantor with this argument, the germ of the idea 
was already in Riemann’s paper. 

Cantor extended the uniqueness theorem in an 1871 paper¹⁸ by requiring convergence 
to zero of $\frac{1}{2} a_0 + \sum_{n=1}^{\infty} A_n$ at all but a finite number of points in $(-\pi, \pi)$. 


18 Cantor (1871). 
He supposed $x_\nu$ to be a point at which the series did not converge. Now by Cantor’s 
first proof, on the left-hand side of $x_\nu$, $F(x) = k_\nu x + l_\nu$ for some constants $k_\nu$ and 
$l_\nu$, whereas on the right-hand side, $F(x) = k_{\nu+1} x + l_{\nu+1}$. Now because F(x) was 
continuous, $k_\nu x_\nu + l_\nu = k_{\nu+1} x_\nu + l_{\nu+1}$, and by Riemann’s second theorem 


$$\lim_{\alpha \to 0} \frac{F(x_\nu + \alpha) - 2F(x_\nu) + F(x_\nu - \alpha)}{\alpha} = \lim_{\alpha \to 0} \frac{x_\nu (k_{\nu+1} - k_\nu) + l_{\nu+1} - l_\nu + \alpha (k_{\nu+1} - k_\nu)}{\alpha} = 0.$$


This implied that $k_{\nu+1} = k_\nu$ and $l_{\nu+1} = l_\nu$; therefore, F(x) was defined by the 
same linear function in the whole interval $(-\pi, \pi)$. 

Cantor then extended the argument to an infinite set with a finite number of 
limit points. Summarizing his argument, suppose x1, x2, x3,... to be a sequence 
with one limit point x. Then, by the previous argument, the isolated points x1, x2, 
x3,... can be removed, and then finally, after an infinite number of steps, x is isolated 
and can be removed. Kronecker was horrified at this mode of argument, involving 
the completion of an infinite number of steps; he suggested to Cantor that he refrain 
from publishing his paper. But to Cantor’s way of thinking, this kind of reasoning was 
quite legitimate, since he subscribed to the concept of a completed infinity. Cantor 
gave further extensions of his uniqueness theorem to more general infinite sets. The 
enterprise led him to turn his attention toward set theory rather than analysis, and he 
spent the rest of his life creating and developing the theory of infinite sets. 


22.8 Exercises 


(1) A solution of an equation $a_0 x^n + a_1 x^{n-1} + \cdots + a_n = 0$, where $a_0, a_1, \ldots, a_n$ 
are integers, is called an algebraic number. Let $|a_0| + |a_1| + \cdots + |a_n| + n$ be the 
height of the equation. Show that there exist only a finite number of equations 
of a given height. Use this theorem to prove Dedekind’s result that the set 
of algebraic numbers is countable, that is, the set can be put in one-to-one 
correspondence with the set of natural numbers. This theorem and this proof 
appeared in print in an 1874 paper of Cantor. Dedekind had communicated 
the proof to Cantor in November 1873. Uncharacteristically, Cantor did not 
mention Dedekind’s contribution. See Ferreirós (1993) for references and a 
possible explanation. 

(2) Read Wilbraham (1848); this paper contains the first discussion of the Gibbs 
phenomenon, dealing with overshoot in the convergence of the partial sums of 
certain Fourier series in the neighborhood of a discontinuity of the function. 
See Hewitt and Hewitt (1980) for a detailed discussion and history of 
the topic. 

(3) In his paper, Riemann gave the function $f(x) = \frac{d\left(x^{\nu} \cos \frac{1}{x}\right)}{dx}$, where $0 < \nu < \frac{1}{2}$, 
as an example of an integrable function, not representable as a Fourier 
series and having an infinite number of maxima and minima. Analyze this 
claim. 


(4) Show that the series $\sum_{n=1}^{\infty} \frac{\sin n^2 x}{n^2}$ converges to a continuous function. Prove 
that the function does not have a derivative at $\xi\pi$ if $\xi$ is irrational; prove 
the same if $\xi = \frac{2A}{4B+1}$ or $\frac{2A+1}{2(2B+1)}$ for integers A and B. Show that when 
$\xi = \frac{2A+1}{2B+1}$ the function has a derivative $= -\frac{1}{2}$. In 1916, Hardy proved the 
nondifferentiability portion of the above result; in 1970, J. Gerver proved 
the differentiability portion. In his lectures, Riemann discussed this series, 
apparently without stating the theorem. Weierstrass was of the opinion that 
Riemann may have intended this to be an example of a continuous but 
nondifferentiable function. Unable to prove this, Weierstrass constructed a 
different example, given in the next exercise. See Segal (1978). 

(5) Show that the function $f(x) = \sum_{n=0}^{\infty} b^n \cos(a^n \pi x)$, where $0 < b < 1$,
$a$ is an odd integer, and $ab > 1 + \frac{3\pi}{2}$, and the function $g(x) = \sum_{n=1}^{\infty} \frac{\cos(n!\,x)}{n!}$
are both continuous and everywhere nondifferentiable. Weierstrass presented
the first example in his lectures, and Paul du Bois-Reymond published it in
1875. G. Darboux published the second example in 1879. See Weierstrass
(1894-1927) vol. 2, pp. 71-74.


(6) Let $t = \sum_{n=1}^{\infty} \frac{c_n}{2^n}$, with $c_n = 0$ or 1, be the binary expansion of $0 < t < 1$. Set
$f(t) = \sum_{n=1}^{\infty} \frac{a_n}{2^n}$, where $a_n$ denotes the number of zeros among $c_1, c_2, \ldots, c_n$ if
$c_n = 0$; if $c_n = 1$, then $a_n$ denotes the number of ones. Prove that $f(t)$ is
continuous and single-valued for $0 < t < 1$ and that $f(t)$ is not differentiable
for any $t$. See Takagi (1990), pp. 5-6. Teiji Takagi (1875-1960) graduated
from the University of Tokyo and then studied under Schwarz, Frobenius, and
Hilbert in Berlin and Göttingen 1898-1901. Even before going to Germany,
Takagi studied Hilbert’s 1897 Zahlbericht. His thesis proved the statement
from Kronecker’s Jugendtraum that all the abelian extensions of the number
field $\mathbb{Q}(\sqrt{-1})$ can be obtained by the division of the lemniscate. Takagi
did his most outstanding work in class field theory; he was one of the first 
Japanese mathematicians to begin his career after the transition to Western 
mathematics in Japan, and he was instrumental in establishing a tradition 
of algebraic number theory there. See Miyake (1994) and Sasaki (1994). 
These two papers, along with other papers of interest, are contained in Sasaki, 
Sugiura, and Dauben (1994). 

(7) For Bolzano’s example of a continuous nowhere differentiable function,
dating from about 1830, read Strichartz (1995) pp. 403-406. He gives a 
graphical presentation and points out that it has close connections with 
fractals. 


(8) Show that the series $\sum_{n=2}^{\infty} \frac{\sin nx}{\ln n}$ converges to a function not integrable
in any interval containing the origin. Then derive the conclusion that this
trigonometric series is not a Fourier series. This example is due to P. Fatou
and is referred to in Lebesgue (1906) p. 124.


(9) Prove W. H. Young’s theorem that if $q_0 \ge q_1 \ge \cdots$ form a monotone
descending sequence with zero as limit, and their decrements also form a
monotone descending sequence, viz., $q_0 - q_1 \ge q_1 - q_2 \ge \cdots$, then the
trigonometric series

$$\frac{1}{2}q_0 + \sum_{n=1}^{\infty} q_n \cos nx$$

is the Fourier series of a positive summable function. Use this to prove that

$$\sum_{n=2}^{\infty} \frac{\cos nx}{(\ln n)^c},$$

where $c > 0$, is a Fourier series. For this and the next exercise, see G. C. Young
and W. H. Young (2000) pp. 449-478.


(10) Prove that if $q_1 \ge q_2 \ge \cdots$ form a monotone descending sequence of
constants with zero as limit and $\sum_{n=1}^{\infty} n^{-1}q_n$ converges, then

$$\sum_{n=1}^{\infty} q_n \sin nx$$

is the Fourier series of a summable function bounded below for positive values
of $x$ and bounded above for negative values of $x$. See Exercise 9.

(11) Prove that if $f \in L^1(-\pi,\pi)$, then the Poisson integral

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\frac{1-r^2}{1-2r\cos(t-x)+r^2}\,dt$$

converges almost everywhere (a.e.) to $f(x)$ as $r \to 1^-$. See Fatou (1906).

(12) Given a series $\sum_{n=1}^{\infty} A_n$, $A_n = a_n\cos nx + b_n\sin nx$, define its conjugate
as the series $\sum_{n=1}^{\infty} B_n$, where $B_n = -b_n\cos nx + a_n\sin nx$. Suppose then
that $\sum_{n=1}^{\infty}(a_n^2 + b_n^2) < \infty$. Prove the Riesz–Fischer theorem that there exist
functions $f, g \in L^2(-\pi,\pi)$ such that $f \sim \sum A_n$ and $g \sim \sum B_n$. Show also
Lusin’s result that

$$\lim_{r\to 1^-}\frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\frac{1-r^2}{1-2r\cos(t-x)+r^2}\,dt
= \lim_{r\to 1^-}\frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\,\frac{r\sin(t-x)}{1-2r\cos(t-x)+r^2}\,dt = f(x) \quad \text{a.e.}$$

Next, deduce the formula for the Cauchy principal value integral:

$$\lim_{\epsilon\to 0}\frac{1}{\pi}\int_{\epsilon<|t|<\pi} g(x+t)\,\frac{dt}{2\tan\frac{t}{2}} = f(x) \quad \text{a.e.} \qquad (22.11)$$

For an arbitrary function $g$, the conjugate $\tilde{g}$ is defined by the negative of the
principal value integral in (22.11). If $g \in L^1$, then in general $\tilde{g}$ might or might
not be in $L^1$, but $\tilde{g} \in L^p$ for $0 < p < 1$. If $g \in L^p$, for $p > 1$, then $\tilde{g} \in L^p$.
Note that Lusin proved the last result when $p = 2$. See Lusin (1913). Nikolai
Lusin (1883-1950) was a student of Dmitri Egorov (1869-1931) at Moscow
University and he founded an important school of mathematics there with
students such as Kolmogorov, Menshov, and Privalov. They developed what
is now called the complex method in Fourier analysis.


(13) Concerning $S_n f$, the $n$th partial sum of the Fourier series of $f$, given by
$\sum_{k=1}^{\infty} A_k$, show that

$$S_n(f) = \sum_{k=1}^{n}(a_k\cos kx + b_k\sin kx)
= \frac{1}{\pi}\int_{-\pi}^{\pi} g(x+t)\,\frac{dt}{2\tan\frac{t}{2}} - \frac{1}{\pi}\int_{-\pi}^{\pi} g(x+t)\,\frac{\cos\left(n+\frac{1}{2}\right)t}{2\sin\frac{t}{2}}\,dt,$$

where the two integrals on the right-hand side should be taken as Cauchy
principal values. Combine this with the result in Exercise 12 to show
that $\lim_{n\to\infty} S_n f(x) = f(x)$ a.e. if and only if the principal value integral
satisfies

$$\lim_{n\to\infty}\int_{-\pi}^{\pi} g(x+t)\,\frac{\cos nt}{t}\,dt = 0 \quad \text{a.e.}$$

From (22.11) it follows that the principal value integral $\int_{-\pi}^{\pi}\frac{g(x+t)}{t}\,dt$ exists
a.e. for $g \in L^2$. Lusin also had an example of a continuous function $g$
with $\int_{-\pi}^{\pi}\left|\frac{g(x+t)}{t}\right|dt = \infty$ on a set of positive measure. Note that in order
for this principal value integral to converge, there must have been a good
deal of cancellation. Lusin conjectured the almost everywhere convergence of
the Fourier series of square integrable functions because he thought that the
cancellation in the principal value integral was the reason for the convergence
of the series. Kolmogorov (1923) contains an example of an integrable, but
not square-integrable, function whose Fourier series diverged almost everywhere.
Lennart Carleson proved Lusin’s conjecture in 1966, and Richard Hunt soon
extended Carleson’s theorem to $L^p$ functions with $p > 1$. One of the
important concepts needed in the Carleson and Hunt proofs was that of
maximal functions. For a locally integrable function $f$, the Hardy–Littlewood
maximal function is defined by

$$Mf(x) = \sup_{h>0}\frac{1}{2h}\int_{x-h}^{x+h}|f(t)|\,dt.$$

(14) Prove that if $f \in L^p(-\pi,\pi)$ for $1 < p < \infty$, then $\tilde{f} \in L^p$ and $\|\tilde{f}\|_p \le
C_p\|f\|_p$. This theorem is due to Marcel Riesz (1928). Also deduce that
$\|S_n f\|_p \le C_p\|f\|_p$.

(15) Show that if $f \in L^p$ for $1 < p < \infty$, then

$$\|Mf\|_p \le C_p\|f\|_p.$$

Show also that if $f(r,x)$ is the Poisson integral of $f$, then

$$\sup_{0<r<1}|f(r,x)| \le C\,Mf(x),$$

and hence

$$\left\|\sup_{0<r<1}|f(r,x)|\right\|_p \le C_p\|f\|_p.$$

These results were published in 1930 by Hardy and Littlewood; see
Hardy (1966-1979) pp. 509-544, especially pp. 530-538.


22.9 Notes on the Literature 


For Carleson’s proof of the convergence theorem, mentioned in Exercise 13, see 
Carleson (1966). Hunt’s extension can be found in Haimo (1968), pp. 235-255. For 
historical background on the convergence of Fourier series, see Hunt’s paper in Butzer 
and Sz.-Nagy (1974). 

Laugwitz (1999) presents a lively account of Riemann’s life and mathematical 
work, including trigonometric series and complex variables. Riemann (2004) contains 
an English translation of his papers and lectures. Hawkins (1975) presents a detailed 
but very readable account of the development of integration theory from Riemann to 
Lebesgue. For a modern discussion of Riemann integrability, see Bressoud (2007), 
p. 251 and for more on Lebesgue, see Bressoud (2008). 

Purkert and Ilgauds (1985) contains the correspondence between Cantor and 
Schwarz. Cantor (1932) contains his work on the uniqueness of trigonometric series. 
See Dauben (1979) for a discussion of the development of Cantor’s mathematical 
thought. Cooke (1993) is an interesting history of the work on the uniqueness of 
trigonometric series, and it also surveys recent contributions. The article by Zygmund 
in Ash (1976) contains some insightful remarks on the development of Fourier series. 


23 


The Hypergeometric Series 


23.1 Preliminary Remarks 


The hypergeometric series and associated functions are among the most important 
in mathematics, partly because they cover a large class of valuable special functions 
as either particular cases or as limiting cases. More importantly, because they have 
the appropriate degree of generality, very useful transformation formulas and other 
relations can be proved about them. The hypergeometric series is defined by 


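$$F(a,b,c,x) = {}_2F_1\left(\begin{matrix} a,b \\ c \end{matrix};x\right) = 1 + \frac{a\cdot b}{1\cdot c}x + \frac{a(a+1)\cdot b(b+1)}{1\cdot 2\cdot c(c+1)}x^2 + \cdots. \qquad (23.1)$$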


The expressions involved can be written more briefly if we adopt the modern notation 
for the shifted factorial: 


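$$(a)_n = a(a+1)\cdots(a+n-1) \quad \text{for } n \ge 1, \qquad (a)_0 = 1. \qquad (23.2)$$
Thus
$$F(a,b,c,x) = {}_2F_1\left(\begin{matrix} a,b \\ c \end{matrix};x\right) = \sum_{n=0}^{\infty}\frac{(a)_n(b)_n}{n!\,(c)_n}x^n. \qquad (23.3)$$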


The subscript notation in F was introduced in the twentieth century when similar 
series with varying numbers of parameters, such as a,b,c, were considered. Note the 
following examples of hypergeometric series in Gauss’s notation: 


1 iin 
(ar aie tee ele a 
1-—x 22: 


x\o 2 
ms lim F(aratt.- 22). 


x 
* = Tn dls ed, = 
. ra ( =) a) Ta + 1) abo 4ab 


a>~w 




Historically, hypergeometric series occurred not only in the study of power series 
but also as inverse factorial series in finite difference theory. James Stirling, in partic- 
ular, employed them in the approximate summation of series and in this connection 
also discovered special cases of important transformation formulas. However, in 1778, 
Euler first introduced the hypergeometric series in the form (23.1). He proved¹ that the
series satisfied the second-order differential equation:

$$x(1-x)\frac{d^2F}{dx^2} + (c-(a+b+1)x)\frac{dF}{dx} - abF = 0, \qquad (23.4)$$


and then used this equation to prove an important transformation formula: 
$$F(a,b,c,x) = (1-x)^{c-a-b}F(c-a,c-b,c,x). \qquad (23.5)$$

The binomial factor can be moved to the left-hand side, as $(1-x)^{a+b-c}$. When this is
expanded as a series and multiplied by the hypergeometric function on the left-hand
side, the coefficients of $x^n$ on the two sides give the identity


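$$\sum_{k=0}^{n}\frac{(a)_k(b)_k(c-a-b)_{n-k}}{k!\,(c)_k\,(n-k)!} = \frac{(c-a)_n(c-b)_n}{n!\,(c)_n} \qquad (23.6)$$
or
$$\sum_{k=0}^{n}\frac{(-n)_k(a)_k(b)_k}{k!\,(c)_k\,(1+a+b-c-n)_k} = \frac{(c-a)_n(c-b)_n}{(c)_n(c-a-b)_n} \qquad (23.7)$$
or in the following modern notation, whose meaning is obvious from (23.7):
$${}_3F_2\left(\begin{matrix} -n,a,b \\ c,\,1+a+b-c-n \end{matrix};1\right) = \frac{(c-a)_n(c-b)_n}{(c)_n(c-a-b)_n}. \qquad (23.8)$$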


Observe that this identity is formally equivalent to (23.5). 
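
As a quick sanity check, the identities (23.6) and (23.7) can be tested in exact rational arithmetic for small $n$. The sketch below is not part of the original text: the helper names (`poch`, `euler_coefficient_sum`, `saalschutz_sum`) and the sample parameter values are assumptions chosen only for illustration, and a finite check of this kind is evidence rather than a proof.

```python
from fractions import Fraction
from math import factorial


def poch(a, n):
    """Shifted factorial (a)_n = a(a+1)...(a+n-1), with (a)_0 = 1."""
    result = Fraction(1)
    for j in range(n):
        result *= a + j
    return result


def euler_coefficient_sum(n, a, b, c):
    """Left side of (23.6): coefficient of x^n in (1-x)^(a+b-c) F(a,b,c,x)."""
    return sum(
        poch(a, k) * poch(b, k) * poch(c - a - b, n - k)
        / (factorial(k) * poch(c, k) * factorial(n - k))
        for k in range(n + 1)
    )


def saalschutz_sum(n, a, b, c):
    """Left side of (23.7): the terminating 3F2 series evaluated at 1."""
    d = 1 + a + b - c - n
    return sum(
        poch(Fraction(-n), k) * poch(a, k) * poch(b, k)
        / (factorial(k) * poch(c, k) * poch(d, k))
        for k in range(n + 1)
    )


# Hypothetical sample parameters, chosen so that no denominator vanishes.
a, b, c = Fraction(1, 3), Fraction(2, 5), Fraction(7, 4)

checks_23_6 = [
    euler_coefficient_sum(n, a, b, c)
    == poch(c - a, n) * poch(c - b, n) / (factorial(n) * poch(c, n))
    for n in range(8)
]
checks_23_7 = [
    saalschutz_sum(n, a, b, c)
    == poch(c - a, n) * poch(c - b, n) / (poch(c, n) * poch(c - a - b, n))
    for n in range(8)
]
print(all(checks_23_6), all(checks_23_7))
```

Exact fractions are used instead of floating point so that equality can be tested directly, without a tolerance.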

In 1797, Johann Friedrich Pfaff (1765-1825) proved Euler’s transformation (23.5) 
by giving an inductive proof of (23.8).² Pfaff was among the leading mathematicians
in Germany during the late eighteenth and early nineteenth centuries; he was the 
formal thesis advisor for Gauss. His results on second-order differential equations 
were inspired by Euler, whose work on this topic appeared in his three volumes on the 
integral calculus.³ Euler’s work on series provided the starting point for the German
combinatorial school founded by C. F. Hindenburg (1741-1808), of which Pfaff was 
a member. Pfaff’s formula (23.8) is very useful for evaluating certain types of sums of 
products of binomial coefficients occurring in combinatorial problems. No one seems 
to have taken notice of this work; in order to save this identity and some other of 
Pfaff’s results from oblivion, Jacobi referred to it in a paper of 1847.⁴ We remark


1 Eu. I-16₂ pp. 41-55. E 710.

2 Pfaff (1797b).

3 Eu. I-11, 12, 13. E 342, E 366, E 385.

4 Jacobi (1969) vol. 6, pp. 174-182, especially p. 178. 
that Jacobi was interested in the history of mathematics and consistently attempted
to give credit to the original discoverer of a concept or formula. In spite of Jacobi’s
efforts, this identity was forgotten for many years. In 1890, it was finally rediscovered
and published by L. Saalschütz,⁵ with whose name it was associated for many years.
In the 1970s, Askey noticed Jacobi’s reference and renamed it, so that it is now
called the Pfaff–Saalschütz identity.⁶ Pfaff could not have foreseen that in the 1990s,
his method of proving (23.8) would become the foundation of George Andrews’s 
general method for proving hypergeometric identities useful in computer algebra 
systems. Pfaff also found the terminating form of another important hypergeometric 
transformation: 


$$F(a,b,c,x) = (1-x)^{-a}F\left(a,c-b,c,\frac{x}{x-1}\right). \qquad (23.9)$$
Note that Pfaff took the parameter a to be a negative integer so that the series 
on both sides were finite. Pfaff derived this formula from a study of the differential 
equation 


$$x^2(a+bx^n)\frac{d^2y}{dx^2} + x(c+ex^n)\frac{dy}{dx} + (f+gx^n)y = X,$$


where X was a function of x. Euler earlier discussed the homogeneous form of this 
equation in his book on the integral calculus. Note that Newton’s transformation (10.4) 
is a particular case of (23.9), obtained by taking $a = 1$, $b = \frac{1}{2}$, $c = \frac{3}{2}$, and $x = -t^2$.
Stirling’s formula (10.13), obtained by equating the series in (10.31) and (10.32), can 
also be derived from (23.9) by taking a = —l and x = - It is possible that Pfaff was 
motivated to study the series in (23.9) by Hindenburg’s 1778 work⁷ on the following
problem: For given numbers $\alpha$ and $\beta$, transform a series $ay + by^2 + cy^3 + \cdots$ to a
series of the form

$$\frac{Ay}{\alpha+\beta y} + \frac{By^2}{(\alpha+\beta y)^2} + \frac{Cy^3}{(\alpha+\beta y)^3} + \cdots;$$

thus, determine $A, B, C, \ldots$ in terms of $a, b, c, \ldots$.
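
Euler's transformation (23.5) and Pfaff's transformation (23.9) are easy to test numerically by summing the series (23.1) directly. The following is a minimal sketch, not taken from the original text: `hyp2f1` is a hypothetical helper that simply truncates the series, the parameter values are arbitrary illustrative choices, and the last lines check the special case $a = 1$, $b = \frac{1}{2}$, $c = \frac{3}{2}$, $x = -t^2$, for which the series sums to $\arctan(t)/t$.

```python
import math


def hyp2f1(a, b, c, x, terms=400):
    """Partial sum of the series (23.1); only meaningful for |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Ratio of consecutive terms: (a+n)(b+n) x / ((c+n)(n+1)).
        term *= (a + n) * (b + n) * x / ((c + n) * (n + 1))
    return total


# Hypothetical sample values with |x| < 1 and |x/(x-1)| < 1.
a, b, c, x = 0.3, 0.7, 1.9, 0.25

direct = hyp2f1(a, b, c, x)
euler = (1 - x) ** (c - a - b) * hyp2f1(c - a, c - b, c, x)   # right side of (23.5)
pfaff = (1 - x) ** (-a) * hyp2f1(a, c - b, c, x / (x - 1))    # right side of (23.9)

# Special case a = 1, b = 1/2, c = 3/2, x = -t^2 of (23.9).
t = 0.5
special_left = hyp2f1(1.0, 0.5, 1.5, -t * t)
special_right = hyp2f1(1.0, 1.0, 1.5, t * t / (1 + t * t)) / (1 + t * t)

print(direct, euler, pfaff)
print(special_left, special_right, math.atan(t) / t)
```

Note that the argument $x/(x-1)$ on the right of (23.9) has modulus less than 1 whenever the real part of $x$ is less than $\frac{1}{2}$, so the transformed series can converge where the original one does not.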


Gauss was the first mathematician to undertake a systematic and thorough study 
of the hypergeometric function. His treatment of the subject appeared in a paper of 
1813. It is possible that Gauss was introduced to the topic when he visited Helmstedt 
in 1799 to use the university library and rented a room in Pfaff’s home. One imagines 
this to be very likely, since Gauss and Pfaff took walks together every evening and 
discussed mathematics. Gauss does not refer to earlier work on hypergeometric series 
so it is hard to determine what he had learned from others. The two most notable 
features of Gauss’s contributions to hypergeometric series were his use of contiguous 
relations to derive the basic formulas and his determination of the conditions for the 
convergence of the series. Some of his unpublished work shows that he wanted to 


5 Saalschütz (1890).
6 Askey (1975) p. 62; Andrews et al. (1999) p. 69. 
7 Hindenburg (1778). 
build the foundation of analysis on a rigorous theory of limits, for which purpose he
carefully defined the concepts of superior and inferior limits of sequences. 

Gauss defined functions contiguous to $F(a,b,c,x)$ as those functions arising from
it when the first, second or third parameter $a, b, c$ was increased or diminished by
one while the other two remained the same. Gauss may have seen the importance
of contiguous functions by reading Stirling’s 1730 Methodus. He found that there
was a linear relation between $F(a,b,c,x)$ and any two contiguous functions; such an
equation is now called a contiguous relation. Clearly there would be $\binom{6}{2} = 15$ such
relations, and Gauss listed all of them in the first section of his 1813 paper. From 
these relations he derived continued fractions expansions of ratios of hypergeometric 
functions, his fundamental summation formula for F(a,b,c,1), and the differential 
equation for F(a,b,c,x). He derived the latter in the second (unpublished) part of 
his paper. In this part, Gauss derived transformation formulas in the same manner as 
Euler before him, except that he also gave examples of quadratic transformations. For 
example: 


$$F\left(a,b,a+b+\tfrac{1}{2},4x-4x^2\right) = F\left(2a,2b,a+b+\tfrac{1}{2},x\right). \qquad (23.10)$$
Gauss treated a,b,c, and x as complex variables and in this connection he pointed 
out that it was necessary to exercise care when dealing with values of x outside the 
circle of convergence of the series. Thus, when x was changed to 1 — x in (23.10), the 
left-hand side would remain unchanged, leading to the evidently contradictory result 
that 


$$F\left(2a,2b,a+b+\tfrac{1}{2},x\right) = F\left(2a,2b,a+b+\tfrac{1}{2},1-x\right). \qquad (23.11)$$


Gauss called this result a paradox and his explanation, from the unpublished portion 
of his paper, is highly interesting, showing that as early as 1812 he was thinking of 
analytic continuation of functions:⁸


To explain this, it ought to be remembered that proper distinction should be made between the two 
significations of the symbol F, viz., whether it represents the function whose nature is expressed 
by the differential equation [(23.4)], or simply the sum of an infinite series. The latter is always 
a perfectly determinate quantity so long as the fourth element lies between —1 and +1, and care 
must be taken not to exceed these limits for otherwise it is entirely without any meaning. On 
the other hand, according to the former signification, it [F] represents a general function which
always varies subject to the law of continuity if the fourth element vary continuously whether 
you attribute real values or imaginary values to it, provided you always avoid the values 0 and 1. 
Hence it is evident that in the latter sense, the function may for equal values of the fourth element 
(the passage or rather the return being made through imaginary quantities) attain unequal values 
of which that which the series F represents is only one, so that it is not at all contradictory that 
while some one value of the function $F(a, b, a+b+\frac{1}{2}, 4y-4yy)$ is equal to $F(2a, 2b, a+b+\frac{1}{2}, y)$
the other value should be equal to $F(2a, 2b, a+b+\frac{1}{2}, 1-y)$ and it would be just as absurd to
deduce thence the equality of these values as it would be to conclude, that since Arc. sin $\frac{1}{2} = 30^\circ$,


8 Gauss (1863-1927) vol. 3, pp. 226-227. For the English translation, see Kikuchi (1891) pp. 144-145. 
Arc. sin $\frac{1}{2} = 150^\circ$, $30^\circ = 150^\circ$. But if we take F in the less general sense, viz. simply as the
sum of the series F, the arguments by which we have deduced (23.10), necessarily suppose y
to increase from the value 0 only up to the point when $x\,[= 4y - 4yy]$ becomes $= 1$, i.e. up to
$y = \frac{1}{2}$. At this point, indeed, the continuity of the series $P = F(a, b, a+b+\frac{1}{2}, 4y-4yy)$ is
interrupted, for evidently $\frac{dP}{dy}$ jumps suddenly from a positive (finite) value to a negative. Thus in
this sense equation (23.10) does not admit of being extended outside the limits $y = \frac{1}{2} - \sqrt{\frac{1}{2}}$ up
to $y = \frac{1}{2}$. If preferred, the same equation can also be put thus:

$$F\left(a,b,a+b+\tfrac{1}{2},x\right) = F\left(2a,2b,a+b+\tfrac{1}{2},\frac{1-\sqrt{1-x}}{2}\right).$$
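
Gauss's point about the two significations of $F$ can be seen numerically when $F$ is taken in the narrow sense, as the sum of the series. The sketch below is an illustration only, not part of the original text: `hyp2f1` is a hypothetical helper that truncates the series (23.1), and the values of $a$, $b$, and $y$ are arbitrary. For $y < \frac{1}{2}$ the two sides of (23.10) agree; for $y > \frac{1}{2}$ the left-hand series instead agrees with the right-hand series evaluated at $1 - y$.

```python
def hyp2f1(a, b, c, x, terms=2000):
    """Partial sum of the series (23.1); only meaningful for |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) * x / ((c + n) * (n + 1))
    return total


# Hypothetical sample parameters; the third parameter is forced to be a + b + 1/2.
a, b = 0.3, 0.4
c = a + b + 0.5


def left(y):
    """Series sum of F(a, b, a+b+1/2, 4y - 4y^2)."""
    return hyp2f1(a, b, c, 4 * y - 4 * y * y)


def right(y):
    """Series sum of F(2a, 2b, a+b+1/2, y)."""
    return hyp2f1(2 * a, 2 * b, c, y)


inside = abs(left(0.2) - right(0.2))          # y < 1/2: (23.10) holds
mirrored = abs(left(0.8) - right(1 - 0.8))    # y > 1/2: matches the value at 1 - y
mismatch = abs(left(0.8) - right(0.8))        # y > 1/2: (23.10) fails for the series

print(inside, mirrored, mismatch)
```

The first two printed differences are at the level of rounding error, while the third is not small, which is the numerical face of the paradox in (23.11).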


Again, Gauss’s letter of December 18, 1811, to his friend F. W. Bessel (1784-1846) 
shows how far he had advanced in developing a theory of functions of complex 
variables:⁹


What should we make of $\int \varphi x.dx$ for $x = a + bi$? Obviously, if we’re to proceed from clear
concepts, we have to assume that x passes, via infinitely small increments (each of the form
$\alpha + i\beta$), from that value at which the integral is supposed to be 0, to $x = a + bi$ and that then
all the $\varphi x.dx$ are summed up. In this way the meaning is made precise. But the progression of x
values can take place in infinitely many ways: Just as we think of the realm of all real magnitudes
as an infinite straight line, so we can envision the realm of all magnitudes, real and imaginary,
as an infinite plane wherein every point which is determined by an abscissa a and an ordinate b
represents as well the magnitude $a + bi$. The continuous passage from one value of x to another
$a + bi$ accordingly occurs along a curve and is consequently possible in infinitely many ways.
But I maintain that the integral $\int \varphi x.dx$ computed via two different such passages always gets
the same value as long as $\varphi x = \infty$ never occurs in the region of the plane enclosed by the curves
describing these two passages. This is a very beautiful theorem, whose not-so-difficult proof I will
give when an appropriate occasion comes up. It is closely related to other beautiful truths having
to do with developing functions in series. The passage from point to point can always be carried
out without ever touching one where $\varphi x = \infty$. However, I demand that those points be avoided
lest the original basic conception of $\int \varphi x.dx$ lose its clarity and lead to contradictions. Moreover
it is also clear from this how a function generated by $\int \varphi x.dx$ could have several values for the
same values of x, depending on whether a point where $\varphi x = \infty$ is gone around not at all, once,
or several times. If, for example, we define log x via $\int \frac{1}{x}dx$ starting at $x = 1$, then arrive at log x
having gone around the point $x = 0$ one or more times or not at all, every circuit adds the constant
$+2\pi i$ or $-2\pi i$; thus the fact that every number has multiple logarithms becomes quite clear.


Thus in 1811, Gauss had a clear conception of complex integration and had 
discovered Cauchy’s integral theorem, published by Cauchy in 1825. He had also 
begun to understand the reason for a function being multivalued; this understanding 
informed Gauss’s comments on (23.11). It is possible that Gauss was motivated 
to study quadratic transformations by his discovery during the mid-1790s of the 
connection between the arithmetic-geometric mean and the complete elliptic integral. 
This integral is defined by 


9 Gauss (1863-1927) vol. 8, pp. 90-92. For the English translation of this portion of the letter, see
Remmert (1991) pp. 167-168. 
$$K(k) = \int_0^{\pi/2}\frac{d\theta}{\sqrt{1-k^2\sin^2\theta}} = \frac{\pi}{2}F\left(\tfrac{1}{2},\tfrac{1}{2},1,k^2\right).$$


In his unpublished paper,¹⁰ Gauss also computed two independent solutions of the
hypergeometric equation in the neighborhood of 0, 1, and $\infty$. He obtained explicit
formulas linearly relating a solution in the neighborhood of one of these points with 
two independent solutions in the neighborhood of another point. As an example, 
consider Gauss’s result 


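$$F(a,b,c,x) = \frac{\Gamma(c)\Gamma(b-a)}{\Gamma(b)\Gamma(c-a)}(-x)^{-a}F\left(a,a+1-c,a+1-b,\frac{1}{x}\right)
+ \frac{\Gamma(c)\Gamma(a-b)}{\Gamma(a)\Gamma(c-b)}(-x)^{-b}F\left(b,b+1-c,b+1-a,\frac{1}{x}\right).$$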


The functions on the right-hand side were solutions in the neighborhood of infinity. 
Gauss also considered the case where the parameter c was an integer so that the 
second independent solution involved a logarithmic term. Euler was also aware of this 
situation. Gauss went further by showing that the digamma function, $\psi(x) = \frac{\Gamma'(x)}{\Gamma(x)}$,
defined in the first part of his paper, could be employed to obtain an expression for the 
second solution. 

Gauss’s paper was quite influential, especially among German mathematicians, 
who produced much important research on this topic in the next three or four decades. 
In 1833, as part of his doctoral dissertation, P. Vorsselman de Heer gave the integral 
representation¹¹

$$F(a,b,c,x) = \frac{\int_0^1 t^{b-1}(1-t)^{c-b-1}(1-xt)^{-a}\,dt}{\int_0^1 t^{b-1}(1-t)^{c-b-1}\,dt}. \qquad (23.12)$$


Note that the integral in the denominator is the beta integral, evaluated by Euler,
equal to $\frac{\Gamma(b)\Gamma(c-b)}{\Gamma(c)}$. This integral representation of $F(a,b,c,x)$ was independently
found by Kummer and published a few years later in his long memoir on hypergeometric
functions.¹² However, in a posthumous paper, Jacobi attributed this formula
to Euler,¹³ though it seems that it does not appear explicitly in Euler’s work. However,
Euler did give an integral representation of a solution of a differential equation closely 
related to the hypergeometric equation; this may have been Jacobi’s reason for the 
attribution. 

In 1828, the Danish mathematician Thomas Clausen (1801-1885) obtained a 
significant result of a different kind.¹⁴ Clausen was born to poor farming people
and did not learn to read or write until the age of 12. He encountered many 
difficulties due to his humble origins. But Gauss thought highly of him, and Clausen’s 


10 Gauss (1863-1927) vol. 3, pp. 207-229. For an English translation of this paper, see Kikuchi (1891) 
pp. 121-149. 

11 Vorsselman de Heer (1833).

12 Kummer (1836) § 27. 

13 Jacobi (1859) p. 149. 

14 Clausen (1828). 
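abundant mathematical talent was eventually recognized. He considered the square of
a hypergeometric series and found

$$\left(F\left(a,b,a+b+\tfrac{1}{2},x\right)\right)^2 = \left({}_2F_1\left(\begin{matrix} a,b \\ a+b+\frac{1}{2} \end{matrix};x\right)\right)^2 = {}_3F_2\left(\begin{matrix} 2a,2b,a+b \\ 2a+2b,\,a+b+\frac{1}{2} \end{matrix};x\right), \qquad (23.13)$$

where

$${}_3F_2\left(\begin{matrix} a,b,c \\ d,e \end{matrix};x\right) = \sum_{n=0}^{\infty}\frac{(a)_n(b)_n(c)_n}{n!\,(d)_n(e)_n}x^n.$$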
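
Clausen's formula (23.13) can be checked coefficient by coefficient: squaring the series on the left is a Cauchy product, and its coefficients should match those of the ${}_3F_2$ series. The sketch below is an illustration, not part of the original text; the helper names and the sample values of $a$ and $b$ are assumptions, and only the first ten coefficients are compared.

```python
from fractions import Fraction
from math import factorial


def poch(a, n):
    """Shifted factorial (a)_n = a(a+1)...(a+n-1), with (a)_0 = 1."""
    result = Fraction(1)
    for j in range(n):
        result *= a + j
    return result


def coeff_2f1(a, b, c, n):
    """Coefficient of x^n in the series (23.3)."""
    return poch(a, n) * poch(b, n) / (factorial(n) * poch(c, n))


def coeff_3f2(a1, a2, a3, b1, b2, n):
    """Coefficient of x^n in the 3F2 series defined after (23.13)."""
    return (poch(a1, n) * poch(a2, n) * poch(a3, n)
            / (factorial(n) * poch(b1, n) * poch(b2, n)))


# Hypothetical sample parameters; the third parameter is forced to be a + b + 1/2.
a, b = Fraction(1, 3), Fraction(1, 5)
c = a + b + Fraction(1, 2)

# Cauchy product: coefficients of the square of the 2F1 series.
square = [
    sum(coeff_2f1(a, b, c, k) * coeff_2f1(a, b, c, n - k) for k in range(n + 1))
    for n in range(10)
]
clausen = [coeff_3f2(2 * a, 2 * b, a + b, 2 * a + 2 * b, c, n) for n in range(10)]

print(square == clausen)
```

Changing `c` to any value other than `a + b + 1/2` makes the comparison fail from the coefficient of $x^2$ onward, which shows how special the parameter constraint is.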


In 1836, Ernst Kummer (1810-1893) published the first major work on hypergeometric
functions after Gauss.¹⁵ He rediscovered much of the material in the
unpublished portion of Gauss’s paper, including quadratic transformations. In fact, 
these transformations are implicitly contained in Gauss’s published paper. Kummer 
also found some results for 3F2 functions, including the existence of three-term 
contiguous relations when x = 1. Kummer was trained as a high school teacher; 
he taught at that level 1831-1841. In 1834, while serving a year in the army, he 
communicated some papers in analysis to Jacobi who is reported by E. Lampe to have 
commented: “There we are; now the Prussian musketeers even enter into competition 
with the professors by way of mathematical works.”¹⁶ However, Jacobi was impressed
with the work done by Kummer under difficult circumstances and wrote in his reply,
“If you think that I could be of any help with obtaining an academic position, I would
be happy to offer my humble services, less because I think that you would need them,
or that they would be significant, but as a token of my great respect for your talent and
your works.”¹⁷ Dirichlet and Jacobi worked to find Kummer a university position.
Kummer became a professor at Breslau in 1842 and moved to Berlin in 1855, when
Dirichlet vacated his chair there to take up the position at Göttingen left open by
Gauss’s death. 

In the 1840s, Jacobi wrote some interesting results on hypergeometric series. In the 
posthumous paper mentioned earlier, he showed that the sequence of hypergeometric 
polynomials F(—n,b,c,x) where n = 0, 1, 2,..., were orthogonal with respect to a 
suitable distribution. Following Euler, he also worked out how definite integrals could 
be employed to study solutions of the hypergeometric equation. In another paper, 
discussed in Section 21.9, he applied the symbolic method to obtain some known 
transformation formulas for hypergeometric functions. 

In a paper of 1857, Bernhard Riemann took a very different approach to hypergeometric
functions as part of his new theory of functions of a complex variable.¹⁸
Riemann gave the foundation of this theory in his famous doctoral dissertation of
1851.¹⁹ An important idea first given in this work and later applied to the theory of


15 Kummer (1836).

16 Kummer (1975) vol. 1, p. 18.
17 See Pieper (2007) pp. 214-215.
18 Riemann (1990) pp. 99-115. 

19 ibid. pp. 35-75. 
abelian functions, hypergeometric functions, and the zeta function was that a complex
analytic function was to a large extent determined by the nature and location of its
singularities. The singularities of the hypergeometric equation are at 0, 1, and $\infty$. In
his 1857 paper, Riemann considered the more general case where the singularities of 
a function were at three distinct values a,b, and c. He axiomatically defined a set of 
functions, called P-functions, satisfying certain properties in the neighborhood of the 
three singularities, but without reference to the hypergeometric function or equation. 
Riemann showed that P-functions were solutions of a second-order differential 
equation reducible to the hypergeometric equation when the singular points were 0, 
1, and $\infty$. He also developed a very simple transformation theory for P-functions by
means of which one could derive a large number of relations among hypergeometric 
functions with little calculation. 

We have seen that Gauss emphasized the fact that the hypergeometric series repre- 
sented a hypergeometric function in only a small part of the domain of definition of the 
function. Moreover, the function was multivalued. Perhaps unable to develop a theory 
of complex variables to treat the hypergeometric function to his satisfaction, Gauss 
held back publication of the second part of his paper on the subject. Riemann saw 
Gauss’s full paper in 1855, after Gauss’s death. Surely this problem left pending by 
Gauss provided Riemann with great motivation for his landmark 1857 paper. Riemann 
also had a strong interest in mathematical physics; as he mentioned in the introduction 
to his paper, the hypergeometric function had numerous applications in physical 
and astronomical researches. After 1857, Riemann continued his investigations on 
the theory of ordinary differential equations with algebraic coefficients. His lectures 
and writings on the topic were published posthumously and eventually led to the 
formulation of what is now known as the Riemann—Hilbert problem. 

Felix Klein (1849-1925) was one of the earliest mathematicians to understand and 
propagate the ideas of Riemann. In 1893, he gave a course of lectures on Riemann’s 
theory of hypergeometric functions.²⁰ Interestingly, a decade later, the English
mathematician E. W. Barnes (1874-1953) presented an alternative development of
the hypergeometric function, based on the complex analytic technique of the Mellin
transform, making use of Cauchy’s calculus of residues.²¹

R. H. Mellin (1854-1935) was a Finnish mathematician who studied analysis first 
under Mittag-Leffler in Stockholm and then with Weierstrass in Berlin. He started 
teaching in 1884 at what was later named the Technical University of Finland. He 
founded a tradition of research in complex function theory in Finland, continued by 
mathematicians such as Ernst Lindelöf, Frithiof and Rolf Nevanlinna, and Lars V.
Ahlfors. Mellin gave a general formulation of the Mellin transform in an 1895 treatise
on the gamma and hypergeometric functions. For a function $f(x)$ integrable on $(0,\infty)$,
the Mellin transform is defined by

$$F(s) = \int_0^{\infty} x^{s-1}f(x)\,dx. \qquad (23.14)$$


20 These lectures were published in 1933. See Klein (1933). 
21 Barnes (1908). 
If $f(x) = O(x^{-a+\epsilon})$ as $x \to 0+$ and $f(x) = O(x^{-b-\epsilon})$ as $x \to +\infty$, for $\epsilon > 0$ and
$a < b$, then the integral converges absolutely and defines an analytic function in the
strip $a < \operatorname{Re} s < b$. Mellin gave the inversion formula:

$$f(x) = \frac{1}{2\pi i}\int_{c-\infty i}^{c+\infty i} x^{-s}F(s)\,ds, \quad a < c < b. \qquad (23.15)$$


In particular, we have the pair of formulas (stated without convergence conditions) 
very useful in analytic number theory: 


$$\Gamma(s) = \int_0^{\infty} x^{s-1}e^{-x}\,dx \quad \text{and} \quad e^{-x} = \frac{1}{2\pi i}\int_{c-\infty i}^{c+\infty i}\Gamma(s)x^{-s}\,ds. \qquad (23.16)$$
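
The first formula in (23.16) says that the gamma function is the Mellin transform (23.14) of $e^{-x}$, and this is simple to confirm numerically. The sketch below is illustrative and not from the original text: the function name `mellin_of_exp`, the substitution $x = e^t$, the trapezoid rule, and the step size and cutoffs are all choices made here, and the cutoffs are only adequate for moderately sized real $s > 1$.

```python
import math


def mellin_of_exp(s, h=0.05, t_min=-60.0, t_max=6.0):
    """Approximate the integral of x^(s-1) e^(-x) over (0, infinity).

    With x = e^t the integral becomes the integral of exp(s*t - e^t) over
    the whole real line, which the trapezoid rule handles well because the
    integrand decays rapidly in both directions.
    """
    steps = int(round((t_max - t_min) / h))
    total = 0.0
    for i in range(steps + 1):
        t = t_min + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(s * t - math.exp(t))
    return total * h


for s in (1.5, 2.5, 4.0):
    print(s, mellin_of_exp(s), math.gamma(s))
```

The comparison uses `math.gamma` from the standard library as the reference value.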


In fact, Riemann had already used the Mellin transform in his famous paper on the 
distribution of primes.²² Other particular cases of the transform were derived by
others, including Mellin himself, before he stated the general formula. The second
formula in (23.16) was apparently first discovered by the French mathematician
Eugène Cahen in 1894.²³ His thesis on the Riemann zeta function and its analogs
contains several interesting results on Dirichlet series, though some of these were not 
rigorously proved until more than a decade later. Cahen followed Riemann in taking 
the Mellin transforms of a function analogous to the theta function to obtain functional 
equations for the corresponding Dirichlet series. He considered some analogs of the 
theta function: 


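$$\sum_{n=1}^{\infty}\left(\frac{n}{p}\right)e^{-n^2\pi x/p}, \qquad \sum_{n=1}^{\infty} n\left(\frac{n}{p}\right)e^{-n^2\pi x/p}, \qquad \sum_{n=1}^{\infty}\sigma_1(n)e^{-2n\pi x},$$

where $\left(\frac{n}{p}\right)$ denoted the Legendre symbol and $\sigma_1(n)$ the sum of the divisors of $n$. Cahen
employed the first sum when $p \equiv 1 \pmod 4$, and the second when $p \equiv 3 \pmod 4$.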

E. W. Barnes studied at Trinity College, Cambridge, from 1893 to 1896. Most of his 
mathematical work was done in the period 1897-1910 on the double gamma function, 
hypergeometric functions and Mellin transforms, and the theory of entire functions. 
In 1915 Barnes left Cambridge to pursue his second career. He was ordained in 1922 
and appointed to the Bishopric of Birmingham in 1924, an office he held until 1952. 

Barnes’s starting point was the observation that from Euler’s integral representation, 
and by expanding $(1 - xt)^{-a}$ as a series, the Mellin transform of the 
hypergeometric function would be 

$$\int_0^\infty x^{s-1} F(a,b,c,-x)\,dx = \frac{\Gamma(c)}{\Gamma(a)\Gamma(b)}\,\frac{\Gamma(s)\Gamma(a-s)\Gamma(b-s)}{\Gamma(c-s)}, \tag{23.17}$$


22 Riemann (1990) pp. 177-185. 
23 Cahen (1894). 

for $\min(\operatorname{Re} a, \operatorname{Re} b) > \operatorname{Re} s > 0$. This suggested the integral representation for the 
hypergeometric function: 

$$\frac{\Gamma(a)\Gamma(b)}{\Gamma(c)}\,F(a,b,c,x) = \frac{1}{2\pi i}\int_{k-i\infty}^{k+i\infty} \frac{\Gamma(s)\Gamma(a-s)\Gamma(b-s)}{\Gamma(c-s)}\,(-x)^{-s}\,ds, \tag{23.18}$$

where $\min(\operatorname{Re} a, \operatorname{Re} b) > k > 0$ and $c \ne 0, -1, -2, \ldots$. This is Barnes’s integral 
for the hypergeometric function and provides the basis for an alternative development 
of these functions. A precise statement of the integral formula requires conditions on 
the path of integration. 


23.2 Euler’s Derivation of the Hypergeometric Equation 


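We follow Euler’s notation as it is easy to understand and his derivation is quite short 
and straightforward.²⁴ Euler let s denote the hypergeometric series (23.1). Then 

$$\partial(x^{c}\,\partial s) = ab\,x^{c-1} + \frac{ab}{1\cdot c}(a+1)(b+1)\,x^{c} + \cdots,$$
$$\partial(x^{a} s) = a\,x^{a-1} + \frac{ab}{1\cdot c}(a+1)\,x^{a} + \cdots.$$

Note that, for the sake of brevity, he frequently suppressed ∂x. Now 

$$\partial\left(x^{b+1-a}\,\partial(x^{a} s)\right) = ab\,x^{b-1} + \frac{ab}{1\cdot c}(a+1)(b+1)\,x^{b} + \cdots = x^{b-c}\,\partial(x^{c}\,\partial s),$$

or 
$$\partial\left(a x^{b} s + x^{b+1}\,\partial s\right) = x^{b-c}\left(c\,x^{c-1}\,\partial s + x^{c}\,\partial\partial s\right),$$
or 
$$a\left(b x^{b-1} s + x^{b}\,\partial s\right) + (b+1)x^{b}\,\partial s + x^{b+1}\,\partial\partial s = c\,x^{b-1}\,\partial s + x^{b}\,\partial\partial s.$$
Dividing by $x^{b-1}$, he got the hypergeometric equation 


$$x(1-x)\,\partial\partial s + (c - (a+b+1)x)\,\partial s - ab\,s = 0. \tag{23.19}$$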


Euler gave an equally simple proof of the transformation formula. He showed 
that $s = (1 - x)^{n} z$ also satisfied a second-order differential equation with the 
hypergeometric form when n = c − a − b. He started by taking the logarithmic 
derivative of s to obtain 


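24 Eu. I-16₂ pp. 41-55. E710. 

$$\frac{\partial s}{s} = \frac{\partial z}{z} - \frac{n\,\partial x}{1-x}. \tag{23.20}$$
The derivative of this equation was 
$$\frac{\partial\partial s}{s} - \frac{(\partial s)^2}{s^2} = \frac{\partial\partial z}{z} - \frac{(\partial z)^2}{z^2} - \frac{n(\partial x)^2}{(1-x)^2}. \tag{23.21}$$
We remark that Euler wrote $\partial s^2$ for $(\partial s)^2$. He then squared (23.20) to get 
$$\frac{(\partial s)^2}{s^2} = \frac{(\partial z)^2}{z^2} - \frac{2n\,\partial x\,\partial z}{z(1-x)} + \frac{nn(\partial x)^2}{(1-x)^2}.$$
He added this equation to (23.21) to get 
$$\frac{\partial\partial s}{s} = \frac{\partial\partial z}{z} - \frac{2n\,\partial x\,\partial z}{z(1-x)} + \frac{n(n-1)(\partial x)^2}{(1-x)^2}. \tag{23.22}$$


When (23.20) and (23.22) were applied to the hypergeometric equation (23.19), he 
could write 


$$x(1-x)\frac{\partial\partial z}{z} - \frac{2nx\,\partial x\,\partial z}{z} + (c-(a+b+1)x)\frac{\partial z}{z} + \frac{n(n-1)x\,(\partial x)^2}{1-x} - \frac{n(c-(a+b+1)x)\,\partial x}{1-x} - ab = 0. \tag{23.23}$$


Next, the two terms with 1 − x in the denominator, the second of which had a 
suppressed ∂x, combined to form 


$$\frac{n((n+a+b)x - c)}{1-x}.$$


When n + a + b = c, the factor 1 − x cancelled. For this n, (23.23) was reduced to 


$$x(1-x)\,\partial\partial z + [c + (a+b-2c-1)x]\,\partial z - (c-a)(c-b)z = 0, \tag{23.24}$$
an equation of the hypergeometric type. Thus, 
$$z = F(c-a, c-b, c, x) = (1-x)^{a+b-c} F(a,b,c,x). \tag{23.25}$$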


This proved Euler’s transformation (23.5). 
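
A numerical sanity check of (23.25) can be sketched with truncated power series. The helper name `hyp2f1`, the cutoff of 200 terms, and the sample parameter values are illustrative assumptions and not part of Euler's argument; the sketch only applies for |x| < 1.

```python
def hyp2f1(a, b, c, x, terms=200):
    """Partial sum of F(a,b,c,x) = sum_n (a)_n (b)_n / (n! (c)_n) x^n, for |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # ratio of consecutive terms of the hypergeometric series
        term *= (a + n) * (b + n) / ((n + 1) * (c + n)) * x
    return total


def euler_transformation_gap(a, b, c, x):
    """|F(c-a, c-b, c, x) - (1-x)^(a+b-c) F(a,b,c,x)|, expected to be near zero."""
    lhs = hyp2f1(c - a, c - b, c, x)
    rhs = (1 - x) ** (a + b - c) * hyp2f1(a, b, c, x)
    return abs(lhs - rhs)


# Arbitrary sample values (assumption); any |x| < 1 with c not in {0, -1, -2, ...} works.
print(euler_transformation_gap(0.3, 1.2, 2.5, 0.4))
```

The gap should be at the level of floating-point rounding, since both sides are the same analytic function near x = 0.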


23.3 Pfaff’s Derivation of the ₃F₂ Identity 


We have already noted that equation (23.25) is equivalent to Pfaff’s identity (23.7). 
Pfaff gave a very interesting proof of this,²⁵ given here in modern notation using 
shifted factorials. Let 


25 Pfaff (1797b) pp. 51-52. 

$$S_n(a,b,c) = \sum_{j=0}^{n} \frac{(-n)_j (a)_j (b)_j}{j!\,(c)_j\,(1-n+a+b-c)_j}.$$

Then, by a simple calculation, 
$$S_n(a,b,c) - S_{n-1}(a,b,c) = \sum_{j=0}^{n}\left(\frac{(-n)_j (a)_j (b)_j}{j!\,(c)_j\,(1-n+a+b-c)_j} - \frac{(1-n)_j (a)_j (b)_j}{j!\,(c)_j\,(2-n+a+b-c)_j}\right)$$
$$= \frac{-(1+a+b-c)\,ab}{c\,(1+a+b-c-n)(2+a+b-c-n)}\; S_{n-1}(a+1,b+1,c+1). \tag{23.26}$$


By induction, the recurrence (23.26), combined with the initial value $S_0(a,b,c) = 1$, 
uniquely determines $S_n(a,b,c)$. Pfaff could easily verify that 


$$\sigma_n(a,b,c) = \frac{(c-a)_n (c-b)_n}{(c)_n (c-a-b)_n}$$


satisfied the same recurrence relation and initial condition, proving his formula (23.7). 
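
Both the recurrence (23.26) and the closed form can be checked in exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the function names `poch`, `S`, `closed_form`, and `recurrence_rhs` are hypothetical, and the parameters must be chosen so that no denominator vanishes.

```python
from fractions import Fraction


def poch(x, n):
    """Shifted factorial (x)_n = x (x+1) ... (x+n-1)."""
    result = Fraction(1)
    for i in range(n):
        result *= x + i
    return result


def S(n, a, b, c):
    """Pfaff's terminating sum S_n(a,b,c)."""
    total = Fraction(0)
    for j in range(n + 1):
        total += (poch(-n, j) * poch(a, j) * poch(b, j)
                  / (poch(1, j) * poch(c, j) * poch(1 - n + a + b - c, j)))
    return total


def closed_form(n, a, b, c):
    """(c-a)_n (c-b)_n / ((c)_n (c-a-b)_n)."""
    return poch(c - a, n) * poch(c - b, n) / (poch(c, n) * poch(c - a - b, n))


def recurrence_rhs(n, a, b, c):
    """Right-hand side of (23.26)."""
    d = a + b - c
    factor = -(1 + d) * a * b / (c * (1 + d - n) * (2 + d - n))
    return factor * S(n - 1, a + 1, b + 1, c + 1)


# Non-integer rational parameters (assumption) keep every denominator nonzero.
a, b, c = Fraction(1, 3), Fraction(2, 5), Fraction(7, 4)
for n in range(1, 6):
    print(n, S(n, a, b, c) == closed_form(n, a, b, c),
          S(n, a, b, c) - S(n - 1, a, b, c) == recurrence_rhs(n, a, b, c))
```

Using `Fraction` rather than floating point makes the comparison an exact identity test for each n tried, although it is of course not a proof for all n.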

This formula is quite useful and important, though this does not seem to have 
been realized until the twentieth century when it found applications to the evaluation 
of combinatorial sums of products of binomial coefficients. In this connection, the 
Chinese mathematician Li Shanlan (1811-1882) is of historical interest. He was 
trained in the Chinese mathematical tradition, though later in life he came to learn 
about Western works on algebra, analytic geometry, and calculus.²⁶ At the age of 8, 
he studied the ancient Chinese text Jiuzhang Suanshu, and six years later he read a 
Chinese translation of the first six books of Euclid’s Elements. Soon after that, he 
studied Chinese works on algebra and trigonometry. Eventually he became interested 
in the summation of finite series. He made some interesting discoveries involving 
Stirling numbers, Euler numbers and other numbers and series of combinatorial 
significance, contained in his work Duoji Bilei. This may be translated as “Heaps 
Summed Using Analogies;” heaps refer to finite sums. In this work, Li Shanlan 
developed and generalized the concepts and formulas of earlier researchers such as 
Wang Lai (1768-1813) and Dong Youcheng (1791-1823). Li Shanlan presented the 
following summation formula: 


$$\sum_{j=0}^{k} \binom{k}{j}^{2} \binom{n+2k-j}{2k} = \binom{n+k}{k}^{2}. \tag{23.27}$$


This formula was brought to the notice of the Hungarian mathematician Paul 
Turán (1910-1976) in 1937. He gave a proof using Legendre polynomials, published 


26 Martzloff (1997) pp. 341-350. 

in 1954.²⁷ This aroused the curiosity of other mathematicians, and it was established 
that the combinatorial sum (23.27) could be written as 


$$\binom{n+2k}{2k}\; {}_3F_2\left(\begin{matrix} -k,\ -k,\ -n \\ 1,\ -n-2k \end{matrix};\, 1\right),$$
and therefore (23.27) could be derived from Pfaff’s formula. Jacobi’s perceptive effort 
to prevent this formula from being forgotten provides further evidence of his insight 
into formulas and his stature as an algorist. 
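
Li Shanlan's summation formula is easy to test for small values with integer arithmetic. The following is a minimal sketch; the function names are hypothetical and the ranges tested are arbitrary.

```python
from math import comb


def li_shanlan_lhs(n, k):
    """Sum over j of C(k,j)^2 * C(n+2k-j, 2k)."""
    return sum(comb(k, j) ** 2 * comb(n + 2 * k - j, 2 * k) for j in range(k + 1))


def li_shanlan_rhs(n, k):
    """C(n+k, k)^2."""
    return comb(n + k, k) ** 2


# Compare both sides on a small grid of non-negative integers.
for n in range(4):
    for k in range(4):
        print(n, k, li_shanlan_lhs(n, k), li_shanlan_rhs(n, k))
```

`math.comb` returns 0 when the lower index exceeds the upper one, which matches the convention needed for the terms with n + 2k − j < 2k.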
As another application of Pfaff’s identity, note that it can be written as 


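$$\sum_{k=0}^{n} \frac{(-n)_k (a)_k (b)_k}{k!\,(c)_k\,(-n+1+a+b-c)_k} = \frac{(c-a)_n (c-b)_n}{(c)_n (c-a-b)_n} = \frac{\dfrac{(c-a)_n}{n!\,n^{c-a-1}}\cdot\dfrac{(c-b)_n}{n!\,n^{c-b-1}}}{\dfrac{(c)_n}{n!\,n^{c-1}}\cdot\dfrac{(c-a-b)_n}{n!\,n^{c-a-b-1}}}.$$

When n → ∞ and Re(c − a − b) > 0, by (17.4), we obtain Gauss’s ₂F₁ summation 
mentioned earlier as (17.14): 

$$F(a,b,c,1) = \frac{\Gamma(c)\,\Gamma(c-a-b)}{\Gamma(c-a)\,\Gamma(c-b)}, \tag{23.28}$$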


though we do not know whether Gauss was aware of this derivation. Note that 
for a = −m, a negative integer, (23.28) reduces to the Vandermonde identity 
(Chu–Vandermonde identity), discussed in Section 25.10. 
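
Gauss's summation (23.28) and its terminating Chu–Vandermonde case can be compared with partial sums of the series at x = 1. This is a rough numerical sketch: the function names and the cutoff of 20000 terms are assumptions, and the partial sum is only accurate when c − a − b is comfortably positive.

```python
import math


def hyp2f1_at_one(a, b, c, terms=20000):
    """Partial sum of F(a,b,c,1); converges when c - a - b > 0."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((n + 1) * (c + n))
    return total


def gauss_sum(a, b, c):
    """Gamma(c) Gamma(c-a-b) / (Gamma(c-a) Gamma(c-b))."""
    return (math.gamma(c) * math.gamma(c - a - b)
            / (math.gamma(c - a) * math.gamma(c - b)))


# A slowly but safely convergent case, then a terminating case with a = -3.
print(hyp2f1_at_one(0.5, 0.75, 4.0), gauss_sum(0.5, 0.75, 4.0))
print(hyp2f1_at_one(-3, 0.5, 2.0), gauss_sum(-3, 0.5, 2.0))
```

For a = −m the series stops after m + 1 terms, so the comparison is exact up to rounding; for general a the tail decays like a power of the cutoff, so small values of c − a − b need many more terms.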


23.4 Gauss’s Contiguous Relations and Summation Formula 


The contiguous relations can be given in compact form if we use the following notation 
for contiguous functions: 


$$F = F(a,b,c,x), \quad F(a+) = F(a+1,b,c,x), \quad \text{etc.}$$


Gauss wrote down all of the fifteen contiguous relations connecting F with two 
functions contiguous to it.²⁸ Here we give four examples: 


$$(c - 2a - (b-a)x)F + a(1-x)F(a+) - (c-a)F(a-) = 0, \tag{23.29}$$
$$(c-a-b)F + a(1-x)F(a+) - (c-b)F(b-) = 0, \tag{23.30}$$
$$(c-a-1)F + aF(a+) - (c-1)F(c-) = 0, \tag{23.31}$$
$$c(c-1-(2c-a-b-1)x)F + (c-a)(c-b)xF(c+) - c(c-1)(1-x)F(c-) = 0. \tag{23.32}$$


27 Turan (1990) vol. 1, pp. 743-747. 
28 Gauss (1813) § 7. 

From the fifteen relations, one may obtain other relations in which more than one 
parameter is changed by one or more; we give a relation presented by Gauss, where 
our notation has the obvious meaning. 

$$F(b+,c+) - F = \frac{a(c-b)x}{c(c+1)}\,F(a+,b+,c+2). \tag{23.33}$$


Gauss proved relations (23.30) and (23.31): First, let 


$$M = \frac{(a+1)_{n-1}(b)_{n-1}}{n!\,(c)_n}.$$


Then the coefficients of $x^n$ in F, F(b−), F(a+), F(c−), and xF(a+) would be 


$$a(b+n-1)M, \quad a(b-1)M, \quad (a+n)(b+n-1)M, \quad \frac{a(b+n-1)(c+n-1)M}{c-1}, \quad n(c+n-1)M,$$


respectively. To obtain (23.31), it was therefore sufficient for him to check that 


$$a(c-a-1)(b+n-1) + a(a+n)(b+n-1) - a(b+n-1)(c+n-1) = 0.$$
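
The coefficient comparison used for (23.31) extends mechanically to (23.30) and (23.33). The sketch below checks that the coefficient of each power of x vanishes, in exact arithmetic; the names `coef` and `residual_*` are hypothetical, and the rational test parameters are arbitrary assumptions.

```python
from fractions import Fraction


def coef(a, b, c, n):
    """Coefficient of x^n in F(a,b,c,x); taken as zero for n < 0."""
    if n < 0:
        return Fraction(0)
    value = Fraction(1)
    for k in range(n):
        value *= (a + k) * (b + k) / ((k + 1) * (c + k))
    return value


def residual_23_30(a, b, c, n):
    """x^n coefficient of (c-a-b)F + a(1-x)F(a+) - (c-b)F(b-)."""
    return ((c - a - b) * coef(a, b, c, n)
            + a * (coef(a + 1, b, c, n) - coef(a + 1, b, c, n - 1))
            - (c - b) * coef(a, b - 1, c, n))


def residual_23_31(a, b, c, n):
    """x^n coefficient of (c-a-1)F + aF(a+) - (c-1)F(c-)."""
    return ((c - a - 1) * coef(a, b, c, n) + a * coef(a + 1, b, c, n)
            - (c - 1) * coef(a, b, c - 1, n))


def residual_23_33(a, b, c, n):
    """x^n coefficient of F(b+,c+) - F - a(c-b)x/(c(c+1)) F(a+,b+,c+2)."""
    return (coef(a, b + 1, c + 1, n) - coef(a, b, c, n)
            - a * (c - b) / (c * (c + 1)) * coef(a + 1, b + 1, c + 2, n - 1))


a, b, c = Fraction(1, 3), Fraction(5, 7), Fraction(9, 4)
for n in range(5):
    print(n, residual_23_30(a, b, c, n), residual_23_31(a, b, c, n),
          residual_23_33(a, b, c, n))
```

Every printed residual should be exactly zero; multiplication by x is handled by shifting the index to n − 1.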


Equation (23.30) can be proved in a similar manner; equation (23.33) can also be 
proved by the direct method. Gauss found his formula (23.28) for F(a,b,c,1) by 
taking x = 1 in (23.4) to obtain 
$$F(a,b,c,1) = \frac{(c-a)(c-b)}{c(c-a-b)}\,F(a,b,c+1,1). \tag{23.34}$$
Note that he proved the convergence of the series for Re(c − a − b) > 0; thus, the 
series on the right-hand side also converged. By repeated application of this equation 
he got 
$$F(a,b,c,1) = \frac{(c-a)_n (c-b)_n}{(c)_n (c-a-b)_n}\,F(a,b,c+n,1). \tag{23.35}$$
Gauss could then express the right-hand side of the equation in terms of the gamma 
function, just as we obtained (23.28); he then let n → ∞ to get the result. 


23.5 Gauss’s Proof of the Convergence of F(a,b,c,x) for c − a − b > 0 


Gauss’s proof of this important result was based on the formula 


$$1 + \frac{\alpha}{\beta} + \frac{\alpha(\alpha+1)}{\beta(\beta+1)} + \cdots + \frac{(\alpha)_k}{(\beta)_k} = \frac{1}{\beta-\alpha-1}\left(\beta - 1 - \frac{(\alpha)_{k+1}}{(\beta)_k}\right). \tag{23.36}$$


This summation formula follows immediately from the following relation; although 
Gauss did not state it explicitly, he knew it well from his numerous calculations with 
hypergeometric series. 


$$\frac{(\alpha)_k}{(\beta)_{k-1}} - \frac{(\alpha)_{k+1}}{(\beta)_k} = (\beta-\alpha-1)\,\frac{(\alpha)_k}{(\beta)_k}. \tag{23.37}$$

A simple algebraic calculation is sufficient to check this relation. The idea was to 
write a hypergeometric term as a difference of two terms. It is interesting that in 1978, 
Bill Gosper showed the tremendous effectiveness of this approach in the summation 
of series of hypergeometric type. Gosper’s method is now one of the fundamental 
algorithms used to sum such series. 

Now note that the ratio of the (n + 1)th term over the nth term of the series 
F(a,b,c,x) (omitting x) is 


$$\frac{(a+n)(b+n)}{(1+n)(c+n)} = \frac{n^2 + (a+b)n + ab}{n^2 + (c+1)n + c}. \tag{23.38}$$

We take a, b, c real, though the argument also applies to complex values. Gauss 
proved, more generally,²⁹ that if the ratio of the consecutive terms in a series was 


$$\frac{n^{\lambda} + A n^{\lambda-1} + B n^{\lambda-2} + C n^{\lambda-3} + \cdots}{n^{\lambda} + a n^{\lambda-1} + b n^{\lambda-2} + c n^{\lambda-3} + \cdots} \tag{23.39}$$


and A − a was a negative quantity with absolute value greater than unity, then the 
series converged. And when this result is applied to the special case of F(a,b,c,x), it 
follows from (23.38) that the hypergeometric series converges for c + 1 − a − b > 1 
or c − a − b > 0. To prove the theorem, write the series, for which the ratio of terms 
is given by (23.39), as $M_1 + M_2 + M_3 + \cdots$. We remark that Gauss did not use 
subscripts; he wrote the series as M + M′ + M″ + ···. Now since a > A + 1, there 
is a sufficiently small number h such that a − h > A + 1, or a − h − 1 > A. Now 
observe that if the fraction (23.39) is multiplied by $\frac{n}{n-1-h}$, we have 


$$\frac{n}{n-1-h}\cdot\frac{M_{n+1}}{M_n} = \frac{n^{\lambda+1} + A n^{\lambda} + \cdots}{n^{\lambda+1} + (a-h-1)n^{\lambda} + \cdots}.$$


If n is large enough, the last ratio is less than 1. Suppose this is true for n ≥ N. Then 


$$|M_{N+1}| < \frac{N-1-h}{N}\,|M_N|,$$
$$|M_{N+2}| < \frac{N-h}{N+1}\,|M_{N+1}| < \frac{(N-h-1)(N-h)}{N(N+1)}\,|M_N|,$$
$$|M_{N+k}| < \frac{(N-h-1)(N-h)\cdots(N-h-1+k-1)}{N(N+1)\cdots(N+k-1)}\,|M_N|.$$


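29 Gauss (1813) § 16. 

Hence, 


$$|M_N| + |M_{N+1}| + \cdots + |M_{N+k}| \le |M_N|\left(1 + \frac{N-h-1}{N} + \frac{(N-h-1)(N-h)}{N(N+1)} + \cdots + \frac{(N-h-1)_k}{(N)_k}\right)$$
$$= \frac{|M_N|}{h}\left(N - 1 - \frac{(N-h-1)_{k+1}}{(N)_k}\right),$$


following from (23.36). The term $\frac{(N-h-1)_{k+1}}{(N)_k}$ tends to zero as k → ∞ because 


$$\frac{(N-h-1)_{k+1}}{(N)_k} = \left(\frac{(N-h-1)_{k+1}}{k!\,k^{N-h-1}}\right)\left(\frac{k!\,k^{N-1}}{(N)_k}\right)\frac{1}{k^{h}}. \tag{23.40}$$


Now the product of the two expressions in parentheses on the right-hand side of (23.40) has the 
limit $\frac{\Gamma(N)}{\Gamma(N-h-1)}$, while $\frac{1}{k^{h}} \to 0$. 
Thus, Gauss proved that 


$$\sum_{k=N}^{\infty} |M_k| \le \frac{N-1}{h}\,|M_N|,$$


and the convergence of $\sum_{n=1}^{\infty} M_n$ followed. 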
Observe that Gauss’s method leads to a great refinement of the ratio test. 
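
The telescoping identity behind Gauss's bound can be confirmed in exact arithmetic. In this sketch the symbols `alpha` and `beta` stand for the two parameters of the finite sum (in the convergence proof they are N − h − 1 and N); the function names and the test values are assumptions made for illustration.

```python
from fractions import Fraction


def poch(x, n):
    """Shifted factorial (x)_n."""
    result = Fraction(1)
    for i in range(n):
        result *= x + i
    return result


def partial_sum(alpha, beta, k):
    """1 + alpha/beta + ... + (alpha)_k / (beta)_k."""
    return sum(poch(alpha, j) / poch(beta, j) for j in range(k + 1))


def telescoped(alpha, beta, k):
    """(beta - 1 - (alpha)_{k+1} / (beta)_k) / (beta - alpha - 1)."""
    return (beta - 1 - poch(alpha, k + 1) / poch(beta, k)) / (beta - alpha - 1)


def difference_relation_holds(alpha, beta, k):
    """(alpha)_k/(beta)_{k-1} - (alpha)_{k+1}/(beta)_k == (beta-alpha-1)(alpha)_k/(beta)_k."""
    lhs = poch(alpha, k) / poch(beta, k - 1) - poch(alpha, k + 1) / poch(beta, k)
    rhs = (beta - alpha - 1) * poch(alpha, k) / poch(beta, k)
    return lhs == rhs


# Example with N = 5 and h = 1/2, so alpha = N - h - 1 and beta = N.
alpha, beta = Fraction(7, 2), Fraction(5)
for k in range(1, 6):
    print(k, partial_sum(alpha, beta, k) == telescoped(alpha, beta, k),
          difference_relation_holds(alpha, beta, k))
```

Writing each term as a difference of two neighbouring quantities is exactly the step that Gosper's algorithm automates for general hypergeometric terms.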


23.6 Raabe’s Test for Convergence 


Joseph Raabe refined Gauss’s convergence result into a general test for convergence 
of a series. In a paper of 1832,³⁰ he stated his convergence test: 


Let $a_0 + a_1 + a_2 + \cdots$ be a series of positive terms. Suppose 
$\lim_{n\to\infty} n\left(\frac{a_n}{a_{n+1}} - 1\right) = k$. Then the series converges when k > 1 and diverges when k < 1. 


Raabe proved this theorem in section 11 of his paper. In section 1, he proved the 
integral test and in section 2 he applied it to show that the series 


$$\frac{1}{1^m} + \frac{1}{2^m} + \frac{1}{3^m} + \cdots$$


converged for m > 1 and diverged when m < 1. Then in section 7 he applied this 
result to prove that the series 


30 Raabe (1832) pp. 63-64. 




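$$\frac{1}{1+m} + \frac{1}{(1+m)\left(1+\frac{m}{2}\right)} + \frac{1}{(1+m)\left(1+\frac{m}{2}\right)\left(1+\frac{m}{3}\right)} + \cdots$$
$$= \frac{1}{1+m} + \frac{1\cdot 2}{(1+m)(2+m)} + \frac{1\cdot 2\cdot 3}{(1+m)(2+m)(3+m)} + \cdots \tag{23.41}$$


converged when m > 1 and diverged when m < 1. To verify this result, he noted that 
if $u_n$ denoted the nth term of (23.41), then 

$$u_n n^m = \frac{n!\,n^m}{(m+1)_n}.$$

Next, according to (17.5), 
$$\lim_{n\to\infty} u_n n^m = \Gamma(m+1),$$

so Raabe could conclude that $\sum_{n=1}^{\infty} u_n$ would behave as the series $\Gamma(m+1)\sum_{n=1}^{\infty}\frac{1}{n^m}$, 
so that the result was verified. 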
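To prove his main theorem, Raabe observed that for m > k and N large enough, 


$$a_{N+1} + a_{N+2} + \cdots > a_N\left(\frac{1}{1+\frac{m}{N}} + \frac{1}{\left(1+\frac{m}{N}\right)\left(1+\frac{m}{N+1}\right)} + \cdots\right),$$


and for m < k and N large enough, 


$$a_{N+1} + a_{N+2} + \cdots < a_N\left(\frac{1}{1+\frac{m}{N}} + \frac{1}{\left(1+\frac{m}{N}\right)\left(1+\frac{m}{N+1}\right)} + \cdots\right).$$


The convergence of (23.41) for m > 1 and its divergence for m < 1 completed the 
proof of Raabe’s convergence theorem. 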
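
Raabe's quantity n(a_n/a_{n+1} − 1) is simple to evaluate numerically. As an illustration, the sketch below applies it to the terms of F(a,b,c,1), whose term ratio is given by (23.38); the function names and the choice n = 10^6 are assumptions, and a finite n only approximates the limit.

```python
def raabe_quantity(ratio, n):
    """n * (a_n / a_{n+1} - 1), where ratio(n) returns a_n / a_{n+1}."""
    return n * (ratio(n) - 1.0)


def hypergeometric_ratio(a, b, c):
    """a_n / a_{n+1} for the terms of F(a,b,c,1): the reciprocal of (23.38)."""
    return lambda n: (1 + n) * (c + n) / ((a + n) * (b + n))


# The limit should be c + 1 - a - b, so the test gives convergence iff c - a - b > 0.
a, b, c = 0.5, 0.25, 1.0
print(raabe_quantity(hypergeometric_ratio(a, b, c), 10**6), c + 1 - a - b)
```

With these sample values the limit is 1.25 > 1, which agrees with Gauss's condition c − a − b > 0.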
In section 12 of his paper, Raabe deduced Gauss’s convergence result, given in 
Section 23.5 of this chapter. He noted that 


$$\frac{a_n}{a_{n+1}} = \frac{n^{h} + a_1 n^{h-1} + \cdots + a_h}{n^{h} + A_1 n^{h-1} + \cdots + A_h}$$
implied, with $\omega = \frac{1}{n}$, that 
$$n\left(\frac{a_n}{a_{n+1}} - 1\right) = \frac{(a_1 - A_1) + (a_2 - A_2)\omega + \cdots + (a_h - A_h)\omega^{h-1}}{1 + A_1\omega + \cdots + A_h\omega^{h}}.$$


Hence 

$$\lim_{n\to\infty} n\left(\frac{a_n}{a_{n+1}} - 1\right) = a_1 - A_1,$$

confirming that the series converged when $a_1 - A_1 > 1$ and diverged when 
$a_1 - A_1 < 1$. 


23.7 Gauss’s Continued Fraction 


Gauss derived an important continued fraction from the contiguous relation (23.33).³¹ 
He set 


$$G(a,b,c,x) = \frac{F(a,b+1,c+1,x)}{F(a,b,c,x)}$$


so that 


$$\frac{F(a+1,b,c+1,x)}{F(a,b,c,x)} = \frac{F(b,a+1,c+1,x)}{F(b,a,c,x)} = G(b,a,c,x).$$


Then, dividing (23.33) by F(a,b+1,c+1,x), he obtained 


$$1 - \frac{1}{G(a,b,c,x)} = \frac{a(c-b)}{c(c+1)}\,x\,G(b+1,a,c+1,x),$$


or 
$$G(a,b,c,x) = \cfrac{1}{1 - \cfrac{a(c-b)}{c(c+1)}\,x\,G(b+1,a,c+1,x)}. \tag{23.42}$$
This process could be continued: 
$$G(b+1,a,c+1,x) = \cfrac{1}{1 - \cfrac{(b+1)(c+1-a)}{(c+1)(c+2)}\,x\,G(a+1,b+1,c+2,x)},$$
and thus 
$$G(a,b,c,x) = \cfrac{1}{1 - \cfrac{a_0 x}{1 - \cfrac{b_1 x}{1 - \cfrac{a_1 x}{1 - \cfrac{b_2 x}{1 - \cdots}}}}}, \tag{23.43}$$
where 
$$a_n = \frac{(a+n)(c+n-b)}{(c+2n)(c+2n+1)} \quad\text{and}\quad b_n = \frac{(b+n)(c+n-a)}{(c+2n-1)(c+2n)}. \tag{23.44}$$
Gauss mentioned an important particular case: when b = 0. In that case, 
$$G(a,0,c-1,x) = F(a,1,c,x), \tag{23.45}$$
and the formulas in (23.44) took the form 
$$a_n = \frac{(a+n)(c+n-1)}{(c+2n-1)(c+2n)}, \qquad b_n = \frac{n(c+n-1-a)}{(c+2n-2)(c+2n-1)}. \tag{23.46}$$


31 Gauss (1813) § 12. 

For a = 1/2, c = 3/2, and x = t², Gauss had 


$$F\left(\tfrac12, 1, \tfrac32, t^2\right) = \frac{1}{2t}\log\frac{1+t}{1-t} = \cfrac{1}{1 - \cfrac{\frac{1}{3}t^2}{1 - \cfrac{\frac{2\cdot 2}{3\cdot 5}t^2}{1 - \cfrac{\frac{3\cdot 3}{5\cdot 7}t^2}{1 - \cdots}}}}.$$


This continued fraction played a fundamental role in Gauss’s theory of numerical 
integration. 
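
Gauss's continued fraction can be evaluated by truncating it and working from the innermost level outward. The sketch below assumes the coefficient pattern $a_0, b_1, a_1, b_2, \ldots$ with $a_n = \frac{(a+n)(c+n-b)}{(c+2n)(c+2n+1)}$ and $b_n = \frac{(b+n)(c+n-a)}{(c+2n-1)(c+2n)}$; the function names, the truncation depth, and the sample values are illustrative assumptions.

```python
import math


def hyp2f1(a, b, c, x, terms=300):
    """Partial sum of the Gauss series F(a,b,c,x), for |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((n + 1) * (c + n)) * x
    return total


def gauss_cf(a, b, c, x, depth=60):
    """Truncated continued fraction for G = F(a,b+1,c+1,x) / F(a,b,c,x)."""
    coeffs = []
    for n in range(depth):
        # a_n from the pattern above
        coeffs.append((a + n) * (c + n - b) / ((c + 2 * n) * (c + 2 * n + 1)))
        # b_{n+1} from the pattern above
        m = n + 1
        coeffs.append((b + m) * (c + m - a) / ((c + 2 * m - 1) * (c + 2 * m)))
    tail = 1.0  # truncation: the omitted remainder is replaced by 1
    for k in reversed(coeffs):
        tail = 1.0 - k * x / tail
    return 1.0 / tail


t = 0.5
# Case b = 0: G(1/2, 0, 1/2, t^2) = F(1/2, 1, 3/2, t^2) = log((1+t)/(1-t)) / (2t).
print(gauss_cf(0.5, 0.0, 0.5, t * t), math.log((1 + t) / (1 - t)) / (2 * t))
print(gauss_cf(0.3, 0.7, 1.9, 0.4), hyp2f1(0.3, 1.7, 2.9, 0.4) / hyp2f1(0.3, 0.7, 1.9, 0.4))
```

Evaluating from the bottom up avoids building the nested expression explicitly; for 0 < x < 1 the truncation error shrinks geometrically with the depth in these examples.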


23.8 Gauss: Transformations of Hypergeometric Functions 


Gauss found solutions of the hypergeometric equation other than F(α, β, γ, x) and 
also used the hypergeometric equation to obtain transformation formulas,³² just as 
Euler had done. Note that Gauss used the symbols α, β, and γ and employed a, b for 
variables in a different context. We shall follow that practice here. He set x = 1 − y 
in the hypergeometric equation to get 


$$(y - yy)\frac{ddP}{dy^2} + (\alpha+\beta+1-\gamma - (\alpha+\beta+1)y)\frac{dP}{dy} - \alpha\beta P = 0.$$


Clearly, P = F(α, β, α+β+1−γ, y) was a solution of this equation and hence 
F(α, β, α+β+1−γ, 1−x) would be an independent solution of the hypergeometric 
equation. Gauss noted that any solution of the hypergeometric equation must be a 
linear combination of these two. He then looked for solutions of the form $P = x^{\mu}P'$ 
by substituting this expression for P in the equation. He observed that the equation 
for P′ was of the hypergeometric form when μ = 0 or μ = 1 − γ. In the latter case, 
the equation for P′ was 


$$(x - xx)\frac{ddP'}{dx^2} + (2-\gamma - (\alpha+\beta+3-2\gamma)x)\frac{dP'}{dx} - (\alpha+1-\gamma)(\beta+1-\gamma)P' = 0.$$


Thus, 


$$P = x^{1-\gamma}F(\alpha+1-\gamma, \beta+1-\gamma, 2-\gamma, x) = (1-x)^{\gamma-\alpha-\beta}x^{1-\gamma}F(1-\alpha, 1-\beta, 2-\gamma, x)$$


would be another solution of the original hypergeometric equation. Observe that 
the last step followed from an application of Euler’s transformation (23.5). It then 
followed that there existed constants M and N such that 


$$F(\alpha,\beta,\alpha+\beta+1-\gamma,1-x) = M\,F(\alpha,\beta,\gamma,x) + N\,x^{1-\gamma}(1-x)^{\gamma-\alpha-\beta}F(1-\alpha,1-\beta,2-\gamma,x).$$

32 Gauss (1863-1927) vol. 3, pp. 208-223. For an English translation, see Kikuchi (1891) pp. 122-137. 

Gauss determined after three pages of interesting calculations that 


$$M = \frac{\Gamma(\alpha+\beta+1-\gamma)\,\Gamma(1-\gamma)}{\Gamma(\alpha+1-\gamma)\,\Gamma(\beta+1-\gamma)}, \qquad N = \frac{\Gamma(\alpha+\beta+1-\gamma)\,\Gamma(\gamma-1)}{\Gamma(\alpha)\,\Gamma(\beta)}.$$


We observe that the case in which α is a negative integer was given by Pfaff in 
1797.³³ In this case, the second term is zero because Γ(α) appears in the denominator. 
Gauss remarked that this formula was useful for computational purposes. Clearly, a 
series would converge more rapidly for x between 0 and 1/2 than for x between 1/2 
and 1. A formula of this type could be applied to convert a slowly convergent series 
to two more rapidly convergent ones. But Gauss cautioned that this formula would 
not be applicable if the series to be transformed was such that the third parameter 
minus the sum of the first two turned out to be an integer. He then went on to show 
that if this occurred, the formula could be modified by the use of his Ψ function 
and the logarithm. He explicitly worked out the formula for the elliptic integral 
F(1/2, 1/2, 1, 1 − x). 

Gauss also found solutions at infinity. He set x = 1/y and then $P = y^{\mu}P'$ and 
observed that P′ was hypergeometric when μ = α or β. Thus, he obtained P as 


$$x^{-\alpha}F\left(\alpha, \alpha+1-\gamma, \alpha+1-\beta, \frac{1}{x}\right) \quad\text{or}\quad x^{-\beta}F\left(\beta, \beta+1-\gamma, \beta+1-\alpha, \frac{1}{x}\right).$$
He then expressed F(α, β, γ, x) as a linear combination of these solutions. 

Gauss derived another general transformation formula by taking $x = \frac{y}{y-1}$ in the 
hypergeometric equation and then $P = (1-y)^{\mu}P'$, so that another hypergeometric 
equation would be obtained when μ = α or β. This gave the necessary result: 


$$F(\alpha,\beta,\gamma,x) = (1-y)^{\alpha}F(\alpha,\gamma-\beta,\gamma,y) = (1-x)^{-\alpha}F\left(\alpha,\gamma-\beta,\gamma,\frac{x}{x-1}\right).$$

In 1797 Pfaff published this result for the case in which α was a negative integer.³⁴ 
Gudermann proved the generalization in 1830. Three years later, P. Vorsselman de 
Heer noted in his thesis³⁵ that Euler’s transformation could be obtained when the 
preceding transformation was applied to itself; Kummer also observed this fact. 

In the published part of his 1813 paper,³⁶ Gauss found the values of the coefficients 
in the expansion 


$$(aa + bb - 2ab\cos\phi)^{-n} = A + 2A'\cos\phi + 2A''\cos 2\phi + 2A'''\cos 3\phi + \cdots \tag{23.47}$$


33. Pfaff (1797a). 

34 Pfaff (1797a). 

35 Vorselman de Heer (1833). 
36 Gauss (1813) § 6. 

in terms of hypergeometric series. He noted that 


$$A^{(p)} = \frac{1}{a^{2n}}\binom{n+p-1}{p}\left(\frac{b}{a}\right)^{p} F\left(n, n+p, p+1, \frac{bb}{aa}\right)$$
$$= \frac{1}{(aa+bb)^{n}}\binom{n+p-1}{p}\left(\frac{ab}{aa+bb}\right)^{p} F\left(\frac{n+p}{2}, \frac{n+p+1}{2}, p+1, \frac{4aabb}{(aa+bb)^2}\right)$$
$$= \frac{1}{(a+b)^{2n}}\binom{n+p-1}{p}\left(\frac{ab}{(a+b)^2}\right)^{p} F\left(n+p, p+\frac12, 2p+1, \frac{4ab}{(a+b)^2}\right). \tag{23.48}$$


Note that Euler studied the series (23.47) in a 1749 memoir on the perturbation of 
planetary orbits and in 1766 Lagrange found the first series for $A^{(p)}$ in (23.48).³⁷ 
This series and its coefficients have been studied intensively, both analytically and 
numerically, and Gauss’s interest in them was evident. If we take a = 1 and x = b², 
the second equation in (23.48) gives 


$$F(n, n+p, p+1, x) = (1+x)^{-n-p}\, F\left(\frac{n+p}{2}, \frac{n+p+1}{2}, p+1, \frac{4x}{(1+x)^2}\right). \tag{23.49}$$


This is an example of a quadratic transformation because the variable on one side 
is x, or it could be a fractional linear transformation of x, while the variable on the 
right-hand side involves x². It is very likely that equation (23.49) led Gauss to study 
such transformations in the second (unpublished) part of his paper. He set $x = \frac{4y}{(1+y)^2}$ 
in the hypergeometric equation and then $P = (1+y)^{2\alpha}Q$ to find that the equation 
satisfied by Q was 


$$(1+y)(y - y^2)\frac{d^2Q}{dy^2} + \left(\gamma - (4\beta-2\gamma)y + (\gamma-4\alpha-2)y^2\right)\frac{dQ}{dy} - 2\alpha\left(2\beta-\gamma+(2\alpha+1-\gamma)y\right)Q = 0.$$


Now note that when β = α + 1/2, 1 + y is a common factor in this equation. We 
remark that equation (23.49) guided Gauss in the substitutions for x, P and β. We 
next have Q = F(2α, 2α+1−γ, γ, y) and finally 


$$F\left(\alpha, \alpha+\frac12, \gamma, \frac{4y}{(1+y)^2}\right) = (1+y)^{2\alpha}F(2\alpha, 2\alpha+1-\gamma, \gamma, y). \tag{23.50}$$
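
Gauss's quadratic transformation, taken here in the form $F(\alpha, \alpha+\frac12, \gamma, \frac{4y}{(1+y)^2}) = (1+y)^{2\alpha}F(2\alpha, 2\alpha+1-\gamma, \gamma, y)$, can be spot-checked with truncated series. The helper names, the number of terms, and the sample values are assumptions; the check is only meaningful for small |y|, where both series converge quickly.

```python
def hyp2f1(a, b, c, x, terms=400):
    """Partial sum of the Gauss series F(a,b,c,x), for |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((n + 1) * (c + n)) * x
    return total


def gauss_quadratic_gap(alpha, gamma, y):
    """|F(alpha, alpha+1/2, gamma, 4y/(1+y)^2) - (1+y)^(2 alpha) F(2 alpha, 2 alpha+1-gamma, gamma, y)|."""
    lhs = hyp2f1(alpha, alpha + 0.5, gamma, 4 * y / (1 + y) ** 2)
    rhs = (1 + y) ** (2 * alpha) * hyp2f1(2 * alpha, 2 * alpha + 1 - gamma, gamma, y)
    return abs(lhs - rhs)


print(gauss_quadratic_gap(0.3, 1.7, 0.2))
```

Note that the argument 4y/(1+y)² stays below 1 for 0 ≤ y < 1 but approaches 1 as y → 1, so the left-hand series converges slowly near that endpoint.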


37 Dutka (1984) p. 22. 


23.9 Kummer’s 1836 Paper on Hypergeometric Series 


Kummer independently rediscovered Gauss’s unpublished results on hypergeometric 
functions, including the quadratic transformations. Of course, he was familiar with 
Gauss’s published paper and with the work of Euler, Pfaff, Jacobi, and Gudermann on 
this topic. Kummer took a general approach. He set out to determine all functions 
z and w of x such that y = wF(α′, β′, γ′, z) satisfied the equation 


$$x(1-x)y'' + (\gamma - (\alpha+\beta+1)x)y' - \alpha\beta y = 0$$


and α′, β′, γ′ were linear combinations of α, β, γ.³⁸ He found z to be a fractional linear 


transformation axtt and that w could be taken to be 


Be et GL ae pe oN 


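Specifically, z could be any one of the six fractional linear transformations serving 
to permute the values 0, 1, and ∞. These would be 


$$z = x, \quad z = 1-x, \quad z = \frac{1}{x}, \quad z = \frac{1}{1-x}, \quad z = \frac{x}{x-1}, \quad z = \frac{x-1}{x}.$$

When z = x, he obtained the four forms 
$$F(\alpha,\beta,\gamma,x), \qquad (1-x)^{\gamma-\alpha-\beta}F(\gamma-\alpha,\gamma-\beta,\gamma,x),$$
$$x^{1-\gamma}F(\alpha-\gamma+1,\beta-\gamma+1,2-\gamma,x),$$
$$x^{1-\gamma}(1-x)^{\gamma-\alpha-\beta}F(1-\alpha,1-\beta,2-\gamma,x).$$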


Thus, he obtained twenty-four solutions of the hypergeometric equation and 
determined the linear relation among any three of them. 

Kummer may have become interested in quadratic transformations after studying 
Gauss’s published equation (23.48). His interest in elliptic integrals may have provided 
him with further motivation to study these transformations. It was clear to Kummer, 
as it was to Gauss, that quadratic transformations existed when the parameters α, β, γ 
in F(α, β, γ, x) satisfied certain relations. So Kummer considered the linear relations 
among the parameters leading to such transformations. In this way, he rediscovered 
Gauss’s results as well as new ones. For example, he obtained 


$$F(\alpha,\beta,2\beta,x) = (1-x)^{\beta-\alpha}\left(1-\frac{x}{2}\right)^{\alpha-2\beta} F\left(\beta-\frac{\alpha}{2}, \frac{2\beta-\alpha+1}{2}, \beta+\frac12, \left(\frac{x}{2-x}\right)^{2}\right).$$

38 Kummer (1836). 

Note that by applying Euler’s transformation to the right-hand side, we get the 
simpler form 


$$F(\alpha,\beta,2\beta,x) = \left(1-\frac{x}{2}\right)^{-\alpha} F\left(\frac{\alpha}{2}, \frac{\alpha+1}{2}, \beta+\frac12, \left(\frac{x}{2-x}\right)^{2}\right).$$
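
Kummer's quadratic transformation of F(α, β, 2β, x) can be spot-checked numerically in both of its forms, the one with the prefactor $(1-x)^{\beta-\alpha}(1-\frac{x}{2})^{\alpha-2\beta}$ and the simpler one with $(1-\frac{x}{2})^{-\alpha}$. The function names, truncation, and sample values below are assumptions made for illustration only.

```python
def hyp2f1(a, b, c, x, terms=300):
    """Partial sum of the Gauss series F(a,b,c,x), for |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((n + 1) * (c + n)) * x
    return total


def kummer_simple(alpha, beta, x):
    """(1 - x/2)^(-alpha) F(alpha/2, (alpha+1)/2, beta+1/2, (x/(2-x))^2)."""
    w = (x / (2 - x)) ** 2
    return (1 - x / 2) ** (-alpha) * hyp2f1(alpha / 2, (alpha + 1) / 2, beta + 0.5, w)


def kummer_original(alpha, beta, x):
    """(1-x)^(beta-alpha) (1-x/2)^(alpha-2 beta) F(beta-alpha/2, (2 beta-alpha+1)/2, beta+1/2, w)."""
    w = (x / (2 - x)) ** 2
    prefactor = (1 - x) ** (beta - alpha) * (1 - x / 2) ** (alpha - 2 * beta)
    return prefactor * hyp2f1(beta - alpha / 2, (2 * beta - alpha + 1) / 2, beta + 0.5, w)


alpha, beta, x = 0.7, 1.3, 0.5
print(hyp2f1(alpha, beta, 2 * beta, x), kummer_simple(alpha, beta, x),
      kummer_original(alpha, beta, x))
```

All three printed numbers should agree to rounding error; the two right-hand forms differ only by an application of Euler's transformation.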


This and Gauss’s transformation (23.50) are the two basic quadratic transformations; 
from these, the others can be obtained by using fractional linear transformations or the 
three-term relations among the different solutions of the hypergeometric equation. At 
the end of his paper, Kummer commented on the more general hypergeometric series 

$$1 + \frac{\alpha\cdot\beta\cdot\lambda}{1\cdot\gamma\cdot\nu}\,x + \frac{\alpha(\alpha+1)\cdot\beta(\beta+1)\cdot\lambda(\lambda+1)}{1\cdot 2\cdot\gamma(\gamma+1)\cdot\nu(\nu+1)}\,x^2 + \cdots.$$


He wrote that he was unable to obtain general transformation formulas for this 
function, although he had several for the case x = 1. As an example, he presented 


$$\sum_{k=0}^{\infty}\frac{(\alpha)_k(\beta)_k(\lambda)_k}{k!\,(\gamma)_k(\nu)_k} = \frac{\Gamma(\nu)\,\Gamma(\nu+\gamma-\alpha-\beta-\lambda)}{\Gamma(\nu-\lambda)\,\Gamma(\nu+\gamma-\alpha-\beta)}\sum_{k=0}^{\infty}\frac{(\gamma-\alpha)_k(\gamma-\beta)_k(\lambda)_k}{k!\,(\gamma)_k(\nu+\gamma-\alpha-\beta)_k}. \tag{23.51}$$


He observed that, in general, this series could not be summed in terms of the gamma 
function, but when λ = 1 and ν = α + β − γ + 2, then its value would be 


$$\frac{\alpha+\beta-\gamma+1}{(\alpha-\gamma+1)(\beta-\gamma+1)}\left(\frac{\Gamma(\gamma)\,\Gamma(\alpha+\beta-\gamma+1)}{\Gamma(\alpha)\,\Gamma(\beta)} - (\gamma-1)\right). \tag{23.52}$$


Recall that Stirling discovered a particular case of Kummer’s transformation where 
λ = 1 and ν = β + 1; see (10.41). 


23.10 Jacobi’s Solution by Definite Integrals 


Euler gave a method of solving differential equations using definite integrals. He 
applied it to solve several second-order differential equations, including one related 
to the hypergeometric equation. Jacobi worked out the specific details of the method 
for the hypergeometric equation and showed how to obtain the twenty-four solutions 
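of Kummer.³⁹ Jacobi started with the observation that for 


$$V = u^{\beta-1}(1-u)^{\gamma-\beta-1}(1-xu)^{-\alpha}, \tag{23.53}$$


$$x(1-x)\frac{d^2V}{dx^2} + (\gamma-(\alpha+\beta+1)x)\frac{dV}{dx} - \alpha\beta V = -\alpha\frac{d}{du}\left(\frac{u(1-u)}{1-xu}\,V\right) = -\alpha\frac{d}{du}\left(u^{\beta}(1-u)^{\gamma-\beta}(1-xu)^{-\alpha-1}\right).$$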

39 Jacobi (1859). 

Hence, $y = \int_0^1 V\,du$ would be a solution of the hypergeometric equation for β > 0 
and γ − β > 0 because 


$$-\alpha\int_0^1 d\left(\frac{u(1-u)}{1-xu}\,V\right) = -\alpha\left[\frac{u(1-u)}{1-xu}\,V\right]_0^1 = -\alpha\left[u^{\beta}(1-u)^{\gamma-\beta}(1-xu)^{-\alpha-1}\right]_0^1 = 0.$$


The expression $u^{\beta}(1-u)^{\gamma-\beta}(1-xu)^{-\alpha-1}$ also vanished at u = ±∞ when 
γ − α − 1 < 0. So if g and h were a pair of the values 0, 1, ±∞, then, Jacobi observed, 
the integral $y = \int_g^h V\,du$ would be a solution of the hypergeometric equation under 
suitable conditions on α, β, γ. 
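
For the limits 0 and 1, the integral of V is a constant multiple of the series F(α, β, γ, x), the constant being the beta integral Γ(β)Γ(γ−β)/Γ(γ). The sketch below tests this numerically with Simpson's rule; the function names, the number of panels, and the sample values (chosen so that the integrand has no endpoint singularities) are assumptions.

```python
import math


def hyp2f1(a, b, c, x, terms=300):
    """Partial sum of the Gauss series F(a,b,c,x), for |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((n + 1) * (c + n)) * x
    return total


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of panels."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def jacobi_integral(alpha, beta, gamma, x):
    """Integral over [0,1] of u^(beta-1) (1-u)^(gamma-beta-1) (1-xu)^(-alpha)."""
    def V(u):
        return u ** (beta - 1) * (1 - u) ** (gamma - beta - 1) * (1 - x * u) ** (-alpha)
    return simpson(V, 0.0, 1.0)


def beta_times_series(alpha, beta, gamma, x):
    """Gamma(beta) Gamma(gamma-beta) / Gamma(gamma) * F(alpha, beta, gamma, x)."""
    B = math.gamma(beta) * math.gamma(gamma - beta) / math.gamma(gamma)
    return B * hyp2f1(alpha, beta, gamma, x)


print(jacobi_integral(0.5, 2.0, 5.0, 0.3), beta_times_series(0.5, 2.0, 5.0, 0.3))
```

With β = 2 and γ = 5 the integrand is smooth on [0, 1], so plain Simpson quadrature is adequate; fractional exponents at the endpoints would call for a more careful quadrature rule.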

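Jacobi also considered a solution of the form $y = \int_g^{\epsilon/x} V\,du$ where ε was a constant. 
When this y was substituted in the hypergeometric equation, Jacobi obtained 


$$-(\gamma-\beta-1)\,\epsilon^{\beta}(1-\epsilon)^{1-\alpha}x^{1-\gamma}(x-\epsilon)^{\gamma-\beta-2} + \alpha\,g^{\beta}(1-g)^{\gamma-\beta}(1-xg)^{-\alpha-1}.$$


The expression involving ε vanished for ε = 1 when 1 − α > 0, so for $y = \int_g^{1/x} V\,du$ 
to be a solution, Jacobi required that 1 − α > 0. Taking x to be positive, Jacobi had 
the six solutions: 


• $y = \int_0^{1} V\,du$, when β and γ − β were positive; 
• $y = \int_0^{-\infty} V\,du$, when β and α + 1 − γ were positive; 
• $y = \int_1^{\infty} V\,du$, when γ − β and α + 1 − γ were positive; 
• $y = \int_0^{1/x} V\,du$, when β and 1 − α were positive; 
• $y = \int_{1/x}^{\infty} V\,du$, when α + 1 − γ and 1 − α were positive; 
• $y = \int_1^{1/x} V\,du$, when γ − β and 1 − α were positive. 


Jacobi then noted that the integral $\int_0^1 u^{\lambda}(1-u)^{\mu}(1-au)^{\nu}\,du$ was in fact a 
constant times the series F(−ν, λ+1, λ+μ+2, a). This series could be derived by 
expanding $(1-au)^{\nu}$ by the binomial expansion and then performing term-by-term 
integration. Also, note that the last five integrals could actually be obtained by a 
suitable substitution in the first one. For example, to go from $\int_0^1 V\,du$ to $\int_0^{-\infty} V\,du$, 
set $u = \frac{v-1}{v}$. Then 


$$y = \int_0^{-\infty} V\,du = (-1)^{\beta}x^{-\alpha}\int_0^1 v^{\alpha-\gamma}(1-v)^{\beta-1}\left(1 - \frac{x-1}{x}\,v\right)^{-\alpha} dv.$$


The corresponding hypergeometric series would be 


$$F\left(\alpha, \alpha+1-\gamma, \alpha+\beta+1-\gamma, \frac{x-1}{x}\right).$$

In this manner, Jacobi represented the six integrals as hypergeometric functions: 


• F(α, β, γ, x), substitution u = v; 
• $x^{-\alpha}F\left(\alpha, \alpha+1-\gamma, \alpha+\beta+1-\gamma, \frac{x-1}{x}\right)$, substitution $u = \frac{v-1}{v}$; 
• $x^{-\alpha}F\left(\alpha, \alpha+1-\gamma, \alpha+1-\beta, \frac{1}{x}\right)$, substitution $u = \frac{1}{v}$; 
• $x^{-\beta}F\left(\beta, \beta+1-\gamma, \beta+1-\alpha, \frac{1}{x}\right)$, substitution $u = \frac{v}{x}$; 
• $x^{1-\gamma}F(\alpha+1-\gamma, \beta+1-\gamma, 2-\gamma, x)$, substitution $u = \frac{1}{vx}$; 
• $x^{\alpha-\gamma}(1-x)^{\gamma-\alpha-\beta}F\left(\gamma-\alpha, 1-\alpha, \gamma+1-\alpha-\beta, \frac{x-1}{x}\right)$, substitution $u = \frac{1}{x+(1-x)v}$. 


Jacobi then observed that, other than the identity, the fractional linear transformations 
mapping the set {0, 1} to itself could be given as 


$$u = 1-v, \quad u = \frac{v}{1-x+vx}, \quad u = \frac{1-v}{1-vx}.$$


Then, up to sign, V du was, respectively, 
$$(1-x)^{-\alpha}v^{\gamma-\beta-1}(1-v)^{\beta-1}\left(1-\frac{x}{x-1}\,v\right)^{-\alpha}dv,$$
$$(1-x)^{-\beta}v^{\beta-1}(1-v)^{\gamma-\beta-1}\left(1-\frac{x}{x-1}\,v\right)^{\alpha-\gamma}dv,$$
$$(1-x)^{\gamma-\alpha-\beta}v^{\gamma-\beta-1}(1-v)^{\beta-1}(1-xv)^{\alpha-\gamma}dv.$$


Observe that we have $y = \int_0^1 V\,du$ as a constant times each of the four expressions: 


$$F(\alpha,\beta,\gamma,x) = (1-x)^{-\alpha}F\left(\alpha,\gamma-\beta,\gamma,\frac{x}{x-1}\right) = (1-x)^{-\beta}F\left(\gamma-\alpha,\beta,\gamma,\frac{x}{x-1}\right) = (1-x)^{\gamma-\alpha-\beta}F(\gamma-\alpha,\gamma-\beta,\gamma,x).$$


Similarly, there are four expressions with each of the six integral solutions, yielding 
Kummer’s twenty-four solutions. 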


23.11 Riemann’s Theory of Hypergeometric Functions 


Kummer showed that the twenty-four solutions of the hypergeometric equation could 
be expressed as hypergeometric series in $x$, $1-x$, $\frac{1}{x}$, $1-\frac{1}{x}$, $\frac{1}{1-x}$, $\frac{1}{1-\frac{1}{x}}$ multiplied 
by suitable powers of x and/or 1 − x. He also gave the relations among any three 
overlapping solutions. In a paper of 1857,⁴⁰ Riemann reversed this process, starting 
with a set of functions with three properties; these properties in turn uniquely 
determined the functions up to a constant factor, as well as the differential equation of 
which these functions were the complete set of solutions. He denoted by 


$$P\begin{Bmatrix} a & b & c & \\ \alpha & \beta & \gamma & x \\ \alpha' & \beta' & \gamma' & \end{Bmatrix}$$
any function satisfying the three properties: 


• For all values of x except a, b, c, called branch points, P was single valued and 
finite. 

• Between any three branches P′, P″, P‴ of this function, there was a linear 
homogeneous relation with constant coefficients, 


$$C'P' + C''P'' + C'''P''' = 0.$$
• The function could be written in the form 
$$C_{\alpha}P^{(\alpha)} + C_{\alpha'}P^{(\alpha')}, \quad C_{\beta}P^{(\beta)} + C_{\beta'}P^{(\beta')}, \quad C_{\gamma}P^{(\gamma)} + C_{\gamma'}P^{(\gamma')},$$
where $C_{\alpha}, C_{\alpha'}, \ldots, C_{\gamma'}$ were constants and 
$$(x-a)^{-\alpha}P^{(\alpha)}, \quad (x-a)^{-\alpha'}P^{(\alpha')}$$


were single valued near x = a and nonvanishing and finite at x = a; a similar 
requirement would hold for 


$$(x-b)^{-\beta}P^{(\beta)}, \quad (x-b)^{-\beta'}P^{(\beta')} \quad\text{at } x = b$$
and for 


$$(x-c)^{-\gamma}P^{(\gamma)}, \quad (x-c)^{-\gamma'}P^{(\gamma')} \quad\text{at } x = c.$$


Moreover, α − α′, β − β′, γ − γ′ were not integers and α + α′ + β + β′ + γ + γ′ = 1. 


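It follows immediately from the definition of P that if x′ is a fractional linear 
transformation of x mapping a, b, c to a′, b′, c′, then 


$$P\begin{Bmatrix} a & b & c & \\ \alpha & \beta & \gamma & x \\ \alpha' & \beta' & \gamma' & \end{Bmatrix} = P\begin{Bmatrix} a' & b' & c' & \\ \alpha & \beta & \gamma & x' \\ \alpha' & \beta' & \gamma' & \end{Bmatrix}. \tag{23.54}$$


Here recall that every conformal mapping of ℂ ∪ {∞} is of the form 


$$x' = \frac{\lambda x + \mu}{\delta x + \nu}, \quad\text{where } \lambda\nu - \mu\delta = 1.$$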

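40 Riemann (1990) pp. 99-115. 

We can therefore choose a′, b′, c′ to be 0, ∞, 1. It is also clear from the definition 
of the Riemann P-function that 


$$\left(\frac{x-a}{x-b}\right)^{\delta}\left(\frac{x-c}{x-b}\right)^{\epsilon} P\begin{Bmatrix} a & b & c & \\ \alpha & \beta & \gamma & x \\ \alpha' & \beta' & \gamma' & \end{Bmatrix} = P\begin{Bmatrix} a & b & c & \\ \alpha+\delta & \beta-\delta-\epsilon & \gamma+\epsilon & x \\ \alpha'+\delta & \beta'-\delta-\epsilon & \gamma'+\epsilon & \end{Bmatrix},$$
$$x^{\delta}(1-x)^{\epsilon}\, P\begin{Bmatrix} 0 & \infty & 1 & \\ \alpha & \beta & \gamma & x \\ \alpha' & \beta' & \gamma' & \end{Bmatrix} = P\begin{Bmatrix} 0 & \infty & 1 & \\ \alpha+\delta & \beta-\delta-\epsilon & \gamma+\epsilon & x \\ \alpha'+\delta & \beta'-\delta-\epsilon & \gamma'+\epsilon & \end{Bmatrix}. \tag{23.55}$$


Following Riemann, we write $P\left(\begin{matrix} \alpha & \beta & \gamma \\ \alpha' & \beta' & \gamma' \end{matrix}\;\; x\right)$ when the first row is 0 ∞ 1. 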


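We may immediately write the relations 
$$P\left(\begin{matrix} 0 & a & 0 \\ 1-c & b & c-a-b \end{matrix}\;\; x\right) = (1-x)^{-a}\,P\left(\begin{matrix} 0 & a & 0 \\ 1-c & c-b & b-a \end{matrix}\;\; \frac{x}{x-1}\right)$$
$$= x^{-a}\,P\left(\begin{matrix} 0 & a & 0 \\ b-a & 1-c+a & c-a-b \end{matrix}\;\; \frac{1}{x}\right)$$
$$= (1-x)^{c-a-b}\,P\left(\begin{matrix} 0 & c-a & 0 \\ 1-c & c-b & a+b-c \end{matrix}\;\; x\right). \tag{23.56}$$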


Riemann also studied contiguous relations satisfied by the P-functions. Following 
Gauss, he used these relations to find the differential equation satisfied by P. In fact, 
Riemann worked out the details only for the case γ = 0, sufficient for his purpose. 
Felix Klein’s student Erwin Papperitz (1857-1938) presented the general case in 1889. 


Riemann found that the equation satisfied by $P\left(\begin{matrix} \alpha & \beta & 0 \\ \alpha' & \beta' & \gamma' \end{matrix}\;\; x\right)$ was 


$$(1-x)\frac{d^2y}{d\log x^2} - \left(\alpha+\alpha'+(\beta+\beta')x\right)\frac{dy}{d\log x} + (\alpha\alpha' - \beta\beta'x)\,y = 0,$$


from which he quite easily showed that 


$$F(a,b,c,x) = \text{const.}\;P\left(\begin{matrix} 0 & a & 0 \\ 1-c & b & c-a-b \end{matrix}\;\; x\right). \tag{23.57}$$


Moreover, the Pfaff and Euler transformations follow from (23.56) and (23.57). 
Riemann’s work on the hypergeometric equation led to important developments in 

the theory of linear differential equations. Riemann himself foresaw some of these 

developments, though he did not publish his ideas. In 1904, James Pierpont wrote 
about this aspect of nineteenth-century mathematics:⁴¹ “A particular class of linear 
differential equations of great importance is the hypergeometric equation; the results 
obtained by Gauss, Kummer, Riemann, and Schwarz relating to this equation have had 
the greatest influence on the development of the general theory. The great extent of the 
theory of linear differential equations may be estimated when we recall that within its 
borders it embraces not only almost all the elementary functions, but also the modular 
and automorphic functions.” 


23.12 Exercises 
(1) Verify (23.52). 


(2) Show that $y = \frac12(\arcsin x)^2$ satisfies the differential equation 


$$(1-x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} = 1.$$


Deduce Takebe’s formula $\frac12(\arcsin x)^2 = \sum_{n=0}^{\infty}\frac{2^{2n}(n!)^2}{(2n+2)!}\,x^{2n+2}$. 


Prove Clausen’s 1828 observation that this formula is a particular case of 
his formula (23.13). See Clausen (1828). Also see Eu. 14 pp. 156-186 and 
the correspondence of Euler and Johann Bernoulli on this topic: Eu. 4A-2 
pp. 161-262. 

(3) Prove the following examples mentioned in Gauss’s 1813 paper on hypergeometric 
series. 


$$\sin nt = n\sin t\; F\left(\tfrac12 n + \tfrac12, -\tfrac12 n + \tfrac12, \tfrac32, \sin^2 t\right);$$
$$\sin nt = n\sin t\cos t\; F\left(\tfrac12 n + 1, -\tfrac12 n + 1, \tfrac32, \sin^2 t\right);$$
$$\cos nt = F\left(\tfrac12 n, -\tfrac12 n, \tfrac12, \sin^2 t\right);$$
$$\cos nt = \cos t\; F\left(\tfrac12 n + \tfrac12, -\tfrac12 n + \tfrac12, \tfrac12, \sin^2 t\right).$$


See Gauss (1863-1927) vol. 3, p. 127. 
(4) Show that 


41 Pierpont (2000). 

$$t = \cfrac{\sin t\cos t}{1 - \cfrac{\frac{1\cdot 2}{1\cdot 3}\sin^2 t}{1 - \cfrac{\frac{1\cdot 2}{3\cdot 5}\sin^2 t}{1 - \cfrac{\frac{3\cdot 4}{5\cdot 7}\sin^2 t}{1 - \cfrac{\frac{3\cdot 4}{7\cdot 9}\sin^2 t}{1 - \cdots}}}}}.$$


See Gauss (1863-1927) vol. 3, pp. 136-137. 


(5) Show that 
$$F\left(2, 4, \tfrac92, x\right) = (1-x)^{-\frac32}\,F\left(\tfrac52, \tfrac12, \tfrac92, x\right).$$


Gauss stated this without proof in the Ephemeridibus Astronomicis Berolinensibus 
1814, p. 257. He gave a proof in the unpublished second part of his 
paper. See Gauss (1863-1927) vol. 3, p. 209. 

(6) Set $x = 1 - y$ in the hypergeometric differential equation and from its
form deduce that $F(a, b, a + b + 1 - c, 1 - x)$ is another solution of the
hypergeometric equation. See Gauss (1863-1927), vol. 3, p. 208.


(7) Show that when $x = \frac{y}{y-1}$, the hypergeometric equation changes to

$$(1 - y)(y - y^2)\,\frac{d^2 F}{dy^2} + (1 - y)\bigl(c + (a + b - c - 1)y\bigr)\frac{dF}{dy} + abF = 0.$$

In this equation set $F = (1 - y)^{\mu} G$ to show that $G$ satisfies

$$(1 - y)(y - y^2)\,\frac{d^2 G}{dy^2} + (1 - y)\bigl(c + (a + b - c - 1 - 2\mu)y\bigr)\frac{dG}{dy}$$

$$+ \bigl(ab - \mu\bigl(c + (a + b - c - 1)y\bigr) + (\mu^2 - \mu)y\bigr)G = 0.$$

Show that when $\mu = a$ or $\mu = b$, then the coefficient of $G$ is divisible by
$(1 - y)$ and thus deduce that

$$F(a, b, c, x) = (1 - x)^{-a}\, F\!\left(a, c - b, c, \frac{x}{x - 1}\right).$$

This is Gauss’s proof of Pfaff’s transformation. See Gauss (1863-1927) vol. 3,
pp. 217-218.
(8) Set $x = 4y - 4y^2$. Show that the hypergeometric equation takes the form

$$(y - y^2)\,\frac{d^2 F}{dy^2} + \bigl(c - (4a + 4b + 2)y + (4a + 4b + 2)y^2\bigr)\,\frac{1}{1 - 2y}\,\frac{dF}{dy} - 4abF = 0.$$

Next, show that the fraction in the middle term is removed by putting
$c = a + b + \frac{1}{2}$. Also deduce that

$$F\!\left(a, b, a + b + \tfrac{1}{2}, 4y - 4y^2\right) = F\!\left(2a, 2b, a + b + \tfrac{1}{2}, y\right).$$

It was by this example that Gauss illustrated the multivaluedness of the
hypergeometric function. See Gauss (1863-1927) vol. 3, pp. 225-227.

(9) Prove Kummer’s transformation (23.51) and its corollary (23.52). Kummer 
stated these formulas without proof at the end of his 1836 paper. See 
Kummer (1975) vol. 2, pp. 75-166. 

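(10) Prove Euler’s continued fraction formula

$$\frac{\beta x}{\gamma}\cdot\frac{{}_2F_1(-\alpha, \beta + 1; \gamma + 1; -x)}{{}_2F_1(-\alpha, \beta; \gamma; -x)}
= \cfrac{\beta x}{\gamma - (\alpha + \beta + 1)x + \cfrac{(\beta + 1)(\alpha + \gamma + 1)x}{\gamma + 1 - (\alpha + \beta + 2)x + \cfrac{(\beta + 2)(\alpha + \gamma + 2)x}{\gamma + 2 - (\alpha + \beta + 3)x + \cdots}}}.$$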

See Eu. I-14 pp. 291-349. 
(11) Prove Ramanujan’s integral formula

$$\int_0^{\infty} x^{s-1}\bigl(\phi(0) - \phi(1)x + \phi(2)x^2 - \cdots\bigr)\,dx = \frac{\pi}{\sin s\pi}\,\phi(-s).$$


See Berndt (1985-1998) part I, pp. 295-307. See also Hardy (1978) pp. 186- 
190; he relates this formula of Ramanujan with a 1914 interpolation theorem 
of F. Carlson, useful in proving hypergeometric formulas. 

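(12) Use the following outline to determine when the square of a hypergeometric
function

$$y = \sum_{n=0}^{\infty} \frac{(\alpha)_n(\beta)_n}{n!\,(\gamma)_n}\,x^n \quad\text{takes the form}\quad z = \sum_{n=0}^{\infty} \frac{(\alpha')_n(\beta')_n(\delta')_n}{n!\,(\gamma')_n(\epsilon')_n}\,x^n.$$

(a) Show that when the hypergeometric equation is multiplied by $x$ and then
differentiated, the result is

$$(x^3 - x^2)\frac{d^3 y}{dx^3} + \bigl((\alpha + \beta + 4)x^2 - (\gamma + 2)x\bigr)\frac{d^2 y}{dx^2} + \bigl((2\alpha + 2\beta + \alpha\beta + 2)x - \gamma\bigr)\frac{dy}{dx} + \alpha\beta y = 0.$$

(b) Show that $z$ satisfies the differential equation

$$(x^3 - x^2)\frac{d^3 z}{dx^3} + \bigl((3 + \alpha' + \beta' + \delta')x^2 - (1 + \gamma' + \epsilon')x\bigr)\frac{d^2 z}{dx^2}$$

$$+ \bigl((1 + \alpha' + \beta' + \delta' + \alpha'\beta' + \alpha'\delta' + \beta'\delta')x - \gamma'\epsilon'\bigr)\frac{dz}{dx} + \alpha'\beta'\delta' z = 0.$$

(c) Show that if $z = y^2$, then the equation in (b) becomes

$$(x^3 - x^2)\,2y\frac{d^3 y}{dx^3} + \bigl((3 + a')x^2 - (1 + d')x\bigr)\,2y\frac{d^2 y}{dx^2} + \bigl((1 + a' + b')x - e'\bigr)\,2y\frac{dy}{dx} + c' y^2$$

$$+\; 6(x^3 - x^2)\frac{dy}{dx}\frac{d^2 y}{dx^2} + 2\bigl((3 + a')x^2 - (1 + d')x\bigr)\left(\frac{dy}{dx}\right)^2 = 0,$$

where $a' = \alpha' + \beta' + \delta'$, $b' = \alpha'\beta' + \alpha'\delta' + \beta'\delta'$, $c' = \alpha'\beta'\delta'$,
$d' = \gamma' + \epsilon'$, $e' = \gamma'\epsilon'$.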


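(d) Multiply the hypergeometric equation, written as
$(x^2 - x)y'' + ((\alpha + \beta + 1)x - \gamma)y' + \alpha\beta y = 0$, by $2Ay + Bx\,\frac{dy}{dx}$ and equation (a) by
$2y$ and add the two equations. Compare the resulting equation with (c),
and deduce that

$$\gamma = \alpha + \beta + \tfrac{1}{2}, \quad A = 2\alpha + 2\beta - 1, \quad B = 6, \quad a' = 3\alpha + 3\beta,$$

$$b' = 2\alpha^2 + 8\alpha\beta + 2\beta^2, \quad c' = 4(\alpha + \beta)\alpha\beta, \quad d' = 3\gamma - 1, \quad e' = (2\gamma - 1)\gamma.$$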


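(e) Deduce that $\alpha' = 2\alpha$, $\beta' = 2\beta$, $\delta' = \alpha + \beta$, $\gamma' = \gamma$, $\epsilon' = 2\gamma - 1$.

(f) Conclude that

$$\left[{}_2F_1\!\left(\begin{matrix}\alpha,\ \beta\\ \alpha + \beta + \frac{1}{2}\end{matrix};\, x\right)\right]^2 = {}_3F_2\!\left(\begin{matrix}2\alpha,\ 2\beta,\ \alpha + \beta\\ \alpha + \beta + \frac{1}{2},\ 2\alpha + 2\beta\end{matrix};\, x\right).$$

See Clausen (1828).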


23.13 Notes on the Literature 


For a history of the hypergeometric series, see Dutka (1984). The reader may
read more on Li Shanlan and other Chinese mathematicians in Martzloff (1997),
pp. 341-350. Discussions of Gauss’s convergence test are available in Bressoud (2007)
and Knopp (1990). In 1859, perhaps influenced by the work of Jacobi, Riemann gave a
course of lectures in which he defined the P-function by means of a complex integral.
See Riemann (1990), pp. 667-691.


Orthogonal Polynomials


24.1 Preliminary Remarks 


Orthogonal polynomials played an important role in the nineteenth-century development
of continued fractions, hypergeometric series, numerical integration, and approximation
theory; in the twentieth century, they additionally contributed to progress in
the moment problem and in functional analysis. However, orthogonal polynomials
may not have received recognition proportional to their significance, leading Barry
Simon to dub them “the Rodney Dangerfield of analysis.”¹,² Nevertheless, when Paul
Nevai edited the proceedings of a 1989 conference on this subject, he stamped on the
dedication page, “I love orthogonal polynomials.”³

A sequence of polynomials $p_n(x)$, $n = 0, 1, 2, \ldots$, is said to be orthogonal with
respect to a weight function $w(x)$ over an interval $(a, b)$, where $-\infty \le a < b \le \infty$, if

$$\int_a^b p_n(x)\,p_m(x)\,w(x)\,dx = h_n \delta_{mn}, \tag{24.1}$$

where $h_n \ne 0$. In a paper on probability written in the early 1770s, Lagrange defined
a sequence of polynomials containing as special cases the Legendre polynomials.
Denoted by $P_n(x)$, the Legendre polynomials are obtained when $a = -1$, $b = 1$ and
$w(x) = 1$ in (24.1). Lagrange gave a three-term recurrence relation for his sequence
of polynomials; for the particular case of Legendre polynomials, this recurrence
amounted to

$$(2n + 1)\,x\,P_n(x) = (n + 1)\,P_{n+1}(x) + n\,P_{n-1}(x), \tag{24.2}$$

$$n = 1, 2, 3, \ldots, \qquad P_0(x) = 1, \quad P_1(x) = x.$$
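The recurrence (24.2) lends itself to mechanical computation. The following is a minimal modern sketch, not anything found in Lagrange or Legendre: the function names `legendre` and `integrate_product` are illustrative choices, polynomials are stored as coefficient lists with the constant term first, and exact rational arithmetic is used so that the orthogonality relation (24.1) with $w(x) = 1$ on $(-1, 1)$ can be checked without rounding error.

```python
from fractions import Fraction


def legendre(n_max):
    """Return [P_0, ..., P_{n_max}] as coefficient lists, constant term first."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]  # P_0 = 1, P_1 = x
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]              # coefficients of x * P_n
        p_prev = polys[n - 1] + [Fraction(0)] * 2    # P_{n-1}, padded to the same length
        # (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}, which is (24.2) rearranged
        polys.append([((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, p_prev)])
    return polys[: n_max + 1]


def integrate_product(p, q):
    """Exact integral of p(x) * q(x) over [-1, 1]."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                     # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total


P = legendre(4)
print(P[3])                            # coefficients of P_3
print(integrate_product(P[2], P[3]))   # distinct degrees
print(integrate_product(P[3], P[3]))   # squared norm of P_3
```

Under these assumptions the integral of a product of two distinct polynomials vanishes, while the squared norm of $P_n$ comes out as $2/(2n+1)$, the value of $h_n$ for the Legendre case.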


In a paper of 1785⁴ on the attraction of spheroids of revolution, Legendre defined
the polynomials now bearing his name by the expansion

$$(1 - 2\cos\theta\, y + y^2)^{-1/2} = 1 + P_1(\cos\theta)\,y + P_2(\cos\theta)\,y^2 + P_3(\cos\theta)\,y^3 + \cdots. \tag{24.3}$$

1 Simon (2005).
2 Simon (2005).
3 Nevai (1990).
4 Legendre (1785).


In this memoir, Legendre needed only the polynomials of even degree and he
explicitly presented $P_2(\cos\theta)$, $P_4(\cos\theta)$, $P_6(\cos\theta)$, and $P_8(\cos\theta)$. We note his first
two examples:

$$P_2(\cos\theta) = \frac{3}{2}\cos^2\theta - \frac{1}{2}, \qquad P_4(\cos\theta) = \frac{5\cdot 7}{2\cdot 4}\cos^4\theta - \frac{3\cdot 5}{2\cdot 4}\,2\cos^2\theta + \frac{1\cdot 3}{2\cdot 4}. \tag{24.4}$$

In the second volume of his Exercices de calcul intégral of 1817,⁵ Legendre gave
the orthogonality relation and an expression for the general $P_n(x)$. Legendre polynomials
played an important role in the celestial mechanics of Laplace, Legendre, and
others.

Gauss used Legendre polynomials in his paper on numerical integration, presented
in Göttingen in 1814,⁶ extending the work of Newton and Cotes. But Gauss did not
refer to the earlier work on these polynomials; rather, he conceived of Legendre polynomials
as an outgrowth of his work in hypergeometric series. The groundbreaking
approach and methodology taken by Gauss in this paper led to important advances
in nineteenth-century numerical analysis. Briefly summarizing Gauss, we suppose
$\int_c^d y(x)\,dx$ is to be computed. Let points $a\,(= a_0), a_1, \ldots, a_n$ be chosen in $[c, d]$ and
let the corresponding values of $y$ at these points be $y_0, y_1, \ldots, y_n$. Set

$$f(x) = (x - a)(x - a_1)(x - a_2)\cdots(x - a_n). \tag{24.5}$$

Note that the $n$th degree Lagrange–Waring polynomial

$$L_n(x) = \sum_{k=0}^{n} \frac{f(x)\,y_k}{f'(a_k)(x - a_k)} \tag{24.6}$$

passes through $(a_k, y_k)$, $k = 0, 1, \ldots, n$, and therefore interpolates $y(x)$; thus, we may
write $y(x) = L_n(x) + r_n(y)$. Then

$$\int_c^d y(x)\,dx = \sum_{k=0}^{n} \lambda_k y_k + R_n(y), \tag{24.7}$$

where

$$\lambda_k = \int_c^d \frac{f(x)\,dx}{f'(a_k)(x - a_k)} \quad\text{and}\quad R_n(y) = \int_c^d r_n(y)\,dx. \tag{24.8}$$

5 Legendre (1811-1817) vol. 2, pp. 249-250.
6 Gauss (1863-1927) vol. 3, pp. 163-196.

It is clear that if $y(x)$ is a polynomial of degree $\le n$, then $y(x) = L_n(x)$ and
hence $R_n(y) = 0$. In the Newton–Cotes scheme, the points $a_0, a_1, \ldots, a_n$ were
equally spaced. Gauss considered whether he could prove $R_n(y) = 0$ for a larger
class of polynomials by varying the nodes $a, a_1, \ldots, a_n$. Since there were $n + 1$ points
to be varied, he wanted the class of polynomials for which $R_n(y) = 0$ to consist of
all polynomials of degree $\le 2n + 1$; indeed, he succeeded in proving this. In short, his
argument began with the observation that

$$R_n\!\left(\frac{1}{t - x}\right) = R_n\!\left(\frac{1}{t} + \frac{x}{t^2} + \frac{x^2}{t^3} + \cdots\right) = \sum_{k=0}^{\infty} \frac{R_n(x^k)}{t^{k+1}}. \tag{24.9}$$

His problem was to choose $a, a_1, \ldots, a_n$ so that $R_n(x^k) = 0$ for $k = 0, 1, \ldots,
2n + 1$; then he could write

$$R_n\!\left(\frac{1}{t - x}\right) = O\!\left(\frac{1}{t^{2n+3}}\right), \quad t \to \infty. \tag{24.10}$$

When $[c, d] = [-1, 1]$, from his results on hypergeometric functions, Gauss had

$$\int_{-1}^{1} \frac{dx}{t - x} = \ln\frac{1 + \frac{1}{t}}{1 - \frac{1}{t}} = \cfrac{2}{t - \cfrac{\frac{1}{3}}{t - \cfrac{\frac{2\cdot 2}{3\cdot 5}}{t - \cfrac{\frac{3\cdot 3}{5\cdot 7}}{t - \cdots}}}}. \tag{24.11}$$

He also knew that the $(n + 1)$th convergent of this continued fraction was a rational
function $\frac{S_n(t)}{P_{n+1}(t)}$, where $S_n$ was of degree $n$ and $P_{n+1}$ of degree $n + 1$. Moreover, this
rational function approximated the continued fraction up to the order $t^{-2n-3}$. So Gauss
factorized $P_{n+1}(t)$ and wrote $\frac{S_n(t)}{P_{n+1}(t)}$ as a sum of partial fractions:

$$\frac{S_n(t)}{P_{n+1}(t)} = \sum_{k=0}^{n} \frac{\lambda_k}{t - a_k}. \tag{24.12}$$

He then easily showed that by using these $a, a_1, \ldots, a_n$ and $\lambda, \lambda_1, \ldots, \lambda_n$, he
would obtain the result. Gauss explicitly wrote down the polynomials $P_{n+1}(x)$ for
$n = 0, 1, 2, \ldots, 6$; we can see they are Legendre polynomials of degrees 1 to 7,
although Gauss did not make this observation. Instead, he gave the hypergeometric
representations of the polynomials $P_{n+1}$ and of the remainder

$$\ln\frac{1 + \frac{1}{t}}{1 - \frac{1}{t}} - \frac{S_n(t)}{P_{n+1}(t)}.$$

At the end of the paper, he computed the zeros of the Legendre polynomials of degree
seven and less with the corresponding $\lambda$. He used these results to compute the integral
$\int \frac{dx}{\ln x}$ over the interval $x = 100000$ to $x = 200000$. Note that Gauss was well aware
that $\int_2^x \frac{dt}{\ln t}$ gave a good approximation for the number of primes less than $x$.

In a paper of 1826, Jacobi pointed out that Gauss’s proof ultimately depended on
the orthogonality of $P_{n+1}(x)$.⁷ To see this, suppose $y(x)$ is a polynomial of degree at
most $2n + 1$. Then $y(x) = q(x)P_{n+1}(x) + r(x)$, where $q(x)$ and $r(x)$ are polynomials
of degree at most $n$. Next note that

$$\int_{-1}^{1} y(x)\,dx = \int_{-1}^{1} q(x)P_{n+1}(x)\,dx + \int_{-1}^{1} r(x)\,dx,$$

where the first integral on the right-hand side vanishes by the orthogonality of
$P_{n+1}(x)$, and the second integral can be exactly computed by the Newton–Cotes
method because the degree of $r(x)$ is not greater than $n$. In fact, Jacobi did not start his
reasoning process with the Legendre polynomial; at that time he may not have known
of the earlier work of Legendre, Laplace, and others on Legendre polynomials. His
argument produced these polynomials, their orthogonality, and the byproduct that

$$P_n(x) = \frac{1}{2^n n!}\,\frac{d^n}{dx^n}(x^2 - 1)^n. \tag{24.13}$$

7 Jacobi (1826).

Interestingly enough, Rodrigues and Ivory had already independently discovered
this useful and important formula for Legendre polynomials.⁸ In 1808, Olinde
Rodrigues enrolled in the Lycée Impérial, later named Lycée Louis-le-Grand,
where Galois also studied. After graduating in 1812, he was admitted to the Université
de Paris, submitting a doctoral thesis on the attraction of spheroids in 1815. Unfortunately,
the haphazard journal in which his memoir on this subject was published
produced only three volumes from 1814 to 1816. This partly explains why Rodrigues’s
work and the formula for Legendre polynomials, in particular, were not noticed. For
several decades, the result was referred to as the formula of Ivory and Jacobi. In 1865,
Hermite finally pointed out Rodrigues’s paper; Cayley referred to it in a different
context in 1858.

James Ivory (1765-1842) was an essentially self-taught Scottish mathematician
whose interest was mainly in applied areas. He received much recognition, but perhaps
suffered from depression, curtailing his career; in a letter to MacVey Napier he
declared, “I believe on the whole I am the most unlucky person that ever existed.”⁹
Most of Ivory’s inspiration was drawn from the work of the French mathematicians
Laplace, Legendre, and Lagrange; his papers contain several references to Laplace’s
book on celestial mechanics. Ivory published (24.13) in an 1824 paper on the shape of
a revolving homogeneous fluid mass in equilibrium; he derived the formula from a
result in his earlier 1812 paper on the attraction of a spheroid. In his 1824 paper, Ivory
remarked on the formula, “From this very simple expression, the most remarkable
properties of the coefficients of the expansion of $\frac{1}{f}$ are very readily deduced.”¹⁰ Here
$f$ refers to the expression $(1 - 2\cos\theta\, y + y^2)^{1/2}$.

8 Rodrigues (1816); Ivory (1824).
9 Craik (2000).
10 Ivory (1824).

The Irish mathematician Robert Murphy (1806-1843), mentioned in Chapter 21,
had a brief mathematical career during which he published papers on integral
equations, operator theory, and algebraic equations. He was perhaps the first to
understand the significance of orthogonality; in a series of papers on integral equations
in the early 1830s, Murphy considered the following problem: Suppose

$$\phi(x) = \int_0^1 t^x f(t)\,dt, \quad x = 0, 1, 2, \ldots.$$

Determine the function $f(t)$ from the function $\phi(x)$. One of the simplest results he
stated in this connection was that if $\phi(x)$ was of the form $\frac{A}{x} + \frac{B}{x^2} + \frac{C}{x^3} + \cdots$, then $f(t)$
would be given by $\frac{1}{t}$ multiplied by the coefficient of $\frac{1}{x}$ in $\phi(x)\cdot t^{-x}$. As an extension
of the previously stated problem, Murphy considered the determination of $f(t)$ from
a knowledge of $\phi(x)$ for a finite number of values of $x$, say $x = 0, 1, \ldots, n - 1$.
The simplest case is when $\phi(x) = 0$ and this leads to the Legendre polynomials as
solutions for $f(t)$. Murphy called such functions reciprocal rather than orthogonal. He
also considered cases where $t$ was replaced by $\ln t$, and this led him to the Laguerre
polynomials

$$T_n(u) = \frac{e^u}{n!}\,\frac{d^n}{du^n}\left(u^n e^{-u}\right), \quad n = 0, 1, 2, \ldots.$$

He proved their orthogonality and found their generating function by applying the
Lagrange inversion formula. Recall that a century before this, D. Bernoulli and Euler
had studied these Laguerre polynomials; Bernoulli computed the zeros of several of
them by his method of recurrent series. See Exercise 3 in Chapter 14.

It may be fair to say that Pafnuty Chebyshev was the creator of the theory of
orthogonal polynomials and its applications. In an important paper of 1855, Chebyshev
introduced and studied discrete orthogonal polynomials.¹¹ This and later papers
were associated with the areas of continued fractions, least squares approximations,
interpolation, and approximate quadrature. Later in his career, Chebyshev’s excellent
students, including A. A. Markov and E. I. Zolotarev, continued his work in these and
other areas.


24.2 Legendre’s Proof of the Orthogonality of His Polynomials 


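In his Exercices, Legendre used the generating function for Legendre polynomials to
offer an elegant and short proof of their orthogonality.¹² Note that Legendre denoted
$P_n(x)$ by $X^n$, but we use the more modern notation. He started with the generating
function

$$(1 - 2xy + y^2)^{-1/2} = \sum_{n=0}^{\infty} P_n(x)\,y^n,$$

where $|x| < 1$ and $|y| < 1$, and considered the integral

$$I = \int_{-1}^{1} \left(\sum_{n=0}^{\infty} P_n(x)\,r^n y^n\right)\left(\sum_{m=0}^{\infty} P_m(x)\,\frac{y^m}{r^m}\right) dx \tag{24.14}$$

$$= \int_{-1}^{1} \frac{dx}{\sqrt{(1 - 2xry + r^2y^2)\left(1 - \frac{2xy}{r} + \frac{y^2}{r^2}\right)}}.$$

11 Chebyshev (1899-1907) vol. 1, pp. 203-230.
12 Legendre (1811-1817) vol. 2, pp. 224-232.

He set

$$x = \frac{1 + r^2y^2 - z^2}{2ry}$$

to obtain

$$\begin{aligned}
I &= \frac{1}{y}\int_{1-ry}^{1+ry} \frac{dz}{\sqrt{z^2 - 1 + r^2 + y^2 - r^2y^2}} \\
&= \frac{1}{y}\ln\left(z + \sqrt{z^2 - 1 + r^2 + y^2 - r^2y^2}\right)\Big|_{1-ry}^{1+ry} \\
&= \frac{1}{y}\ln\frac{1 + ry + r + y}{1 - ry + r - y} = \frac{1}{y}\ln\frac{1 + y}{1 - y} \\
&= 2\left(1 + \frac{y^2}{3} + \frac{y^4}{5} + \cdots + \frac{y^{2n}}{2n + 1} + \cdots\right).
\end{aligned}$$

Comparing this expression with the integral (24.14), he obtained orthogonality:

$$\int_{-1}^{1} P_m(x)\,P_n(x)\,dx = \frac{2}{2n + 1}\,\delta_{mn}. \tag{24.15}$$

He also used the generating function to obtain

$$P_n(x) = \frac{1\cdot 3\cdot 5\cdots(2n - 1)}{1\cdot 2\cdot 3\cdots n}\,x^n - \frac{1\cdot 3\cdot 5\cdots(2n - 3)}{1\cdot 2\cdot 3\cdots(n - 2)}\cdot\frac{x^{n-2}}{2} + \cdots. \tag{24.16}$$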
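The explicit expansion (24.16) can be checked against the two examples in (24.4). The sketch below is a modern illustration under a stated assumption: the general coefficient of $x^{n-2k}$ is taken to be $(-1)^k\,\frac{1\cdot 3\cdots(2n - 2k - 1)}{2^k\,k!\,(n - 2k)!}$, the standard closed form, which agrees with the two leading terms displayed in (24.16). The helper names `odd_product` and `legendre_explicit` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def odd_product(m):
    """Return 1 * 3 * 5 * ... * m for odd m >= 1, and 1 for m <= 0."""
    result = 1
    for factor in range(1, m + 1, 2):
        result *= factor
    return result


def legendre_explicit(n):
    """Coefficients of P_n, constant term first, from the assumed closed form."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n // 2 + 1):
        numerator = (-1) ** k * odd_product(2 * (n - k) - 1)
        denominator = 2 ** k * factorial(k) * factorial(n - 2 * k)
        coeffs[n - 2 * k] = Fraction(numerator, denominator)
    return coeffs


print(legendre_explicit(2))   # compare with P_2 in (24.4)
print(legendre_explicit(4))   # compare with P_4 in (24.4)
```

For $n = 4$ the three nonzero coefficients are $\frac{5\cdot 7}{2\cdot 4}$, $-\frac{3\cdot 5}{2\cdot 4}\cdot 2$ and $\frac{1\cdot 3}{2\cdot 4}$, matching Legendre’s own presentation of $P_4$.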


24.3 Gauss on Numerical Integration 


Gauss started his 1815 paper¹³ with a discussion of the Newton–Cotes method for
numerical integration. Let $[0, 1]$ be the interval and let $a, a_1, a_2, \ldots, a_n$ be $n + 1$ points
in that interval. Set

$$f(x) = (x - a)(x - a_1)(x - a_2)\cdots(x - a_n) = x^{n+1} + c_1x^n + c_2x^{n-1} + \cdots + c_{n+1}. \tag{24.17}$$

Let $y$ be a function to be integrated over $[0, 1]$ and let $y\,(= y_0), y_1, y_2, \ldots, y_n$ be
its values at $a = a_0, a_1, a_2, \ldots, a_n$, respectively. The Lagrange–Waring interpolating
polynomial of degree $n$ for $y$ is then given by

$$g(x) = \frac{f(x)\,y}{f'(a)(x - a)} + \frac{f(x)\,y_1}{f'(a_1)(x - a_1)} + \cdots + \frac{f(x)\,y_n}{f'(a_n)(x - a_n)} = \sum_{k=0}^{n} \frac{f(x)\,y_k}{f'(a_k)(x - a_k)}. \tag{24.18}$$

The Newton–Cotes method then consists in integrating the interpolating
polynomial:

$$\int_0^1 y\,dx = \sum_{k=0}^{n} \lambda_k y_k + R_n(y), \tag{24.19}$$

where

$$\lambda_k = \int_0^1 \frac{f(x)\,dx}{f'(a_k)(x - a_k)} \tag{24.20}$$

and $R_n(y)$ is the remainder. This remainder is zero when $y$ is a polynomial of degree
at most $n$. Gauss asked whether it was possible to choose $a, a_1, \ldots, a_n$ in such a way
that the remainder would be zero for polynomials of degree at most $2n + 1$, with
the points $a, a_1, \ldots, a_n$ no longer equally spaced as in the Newton–Cotes procedure.

13 Gauss (1863-1927) vol. 3, pp. 163-196.


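Gauss observed that since $f(a) = 0$, he had

$$\begin{aligned}
\frac{f(x)}{x - a} = \frac{f(x) - f(a)}{x - a} &= \frac{x^{n+1} - a^{n+1}}{x - a} + c_1\,\frac{x^n - a^n}{x - a} + \cdots + c_n\,\frac{x - a}{x - a} \\
&= x^n + x^{n-1}a + x^{n-2}a^2 + \cdots + a^n \\
&\quad + c_1\left(x^{n-1} + x^{n-2}a + \cdots + a^{n-1}\right) + \cdots + c_n.
\end{aligned}$$

Hence, after rearranging terms,

$$\begin{aligned}
\int_0^1 \frac{f(x)}{x - a}\,dx &= a^n + c_1a^{n-1} + c_2a^{n-2} + \cdots + c_n \\
&\quad + \frac{1}{2}\left(a^{n-1} + c_1a^{n-2} + \cdots + c_{n-1}\right) \\
&\quad + \frac{1}{3}\left(a^{n-2} + c_1a^{n-3} + \cdots + c_{n-2}\right) \\
&\quad + \cdots + \frac{1}{n}(a + c_1) + \frac{1}{n + 1}.
\end{aligned} \tag{24.21}$$

Here Gauss noted that the nonnegative powers of $x$ in the product

$$\left(x^{n+1} + c_1x^n + c_2x^{n-1} + \cdots + c_{n+1}\right)\left(\frac{1}{x} + \frac{1}{2x^2} + \frac{1}{3x^3} + \cdots\right) = -f(x)\ln\left(1 - \frac{1}{x}\right) \tag{24.22}$$

gave the terms on the right-hand side of (24.21) when $x = a$. So he could write

$$-f(x)\ln\left(1 - \frac{1}{x}\right) = T_1(x) + T_2(x), \tag{24.23}$$

where $T_1(x)$ was the polynomial or principal part of $-f(x)\ln\left(1 - \frac{1}{x}\right)$; then, by (24.20),

$$T_1(a_k) = \int_0^1 \frac{f(x)\,dx}{x - a_k} = \lambda_k f'(a_k). \tag{24.24}$$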


Denoting $R_n(x^m)$ by $k_m$, Gauss used (24.19) to obtain

$$\sum_{k=0}^{n} \lambda_k a_k^m = \frac{1}{m + 1} - k_m, \quad m = 0, 1, 2, \ldots. \tag{24.25}$$

It followed that

$$\begin{aligned}
\sum_{k=0}^{n} \frac{\lambda_k}{x - a_k} &= \sum_{k=0}^{n} \lambda_k\left(\frac{1}{x} + \frac{a_k}{x^2} + \frac{a_k^2}{x^3} + \cdots\right) \\
&= (1 - k_0)\frac{1}{x} + \left(\frac{1}{2} - k_1\right)\frac{1}{x^2} + \left(\frac{1}{3} - k_2\right)\frac{1}{x^3} + \left(\frac{1}{4} - k_3\right)\frac{1}{x^4} + \cdots \\
&= -\ln\left(1 - \frac{1}{x}\right) - \left(\frac{k_{n+1}}{x^{n+2}} + \frac{k_{n+2}}{x^{n+3}} + \cdots\right),
\end{aligned} \tag{24.26}$$

where Gauss used the fact that $k_j = 0$ for $j = 0, 1, \ldots, n$. Gauss then had to determine
conditions on $f(x)$ so that $k_j = 0$ for $j = n + 1, n + 2, \ldots, 2n + 1$ as well. By (24.23)
and (24.24), he could deduce that

$$T_1(x) = f(x)\sum_{k=0}^{n} \frac{\lambda_k}{x - a_k}.$$

This was possible because both sides were polynomials of degree $n$, equal for $n + 1$
values of $x$, given by $a_0, a_1, a_2, \ldots, a_n$. Thus, by (24.23) and (24.26), it followed
that

$$f(x)\left(\frac{k_{n+1}}{x^{n+2}} + \frac{k_{n+2}}{x^{n+3}} + \cdots\right) = T_2(x). \tag{24.27}$$

Gauss used this analysis to find $f(x)$ of small degrees. For example, when $n =
0$, $f(x) = x + c_1$, he had to consider


$$(x + c_1)\left(\frac{1}{x} + \frac{1}{2x^2} + \frac{1}{3x^3} + \cdots\right).$$

For the coefficient of $\frac{1}{x}$ to be zero, he required that $c_1 + \frac{1}{2} = 0$ or $c_1 = -\frac{1}{2}$. For
$n = 1$, $f(x) = x^2 + c_1x + c_2$, and the coefficients of $\frac{1}{x}$ and $\frac{1}{x^2}$ in the expansion of
$-f(x)\ln\left(1 - \frac{1}{x}\right)$ then had to be zero. He could then write the equations for $c_1$ and $c_2$:

$$\frac{1}{3} + \frac{c_1}{2} + c_2 = 0 \quad\text{and}\quad \frac{1}{4} + \frac{c_1}{3} + \frac{c_2}{2} = 0.$$

Thus, $c_1 = -1$ and $c_2 = \frac{1}{6}$ and so the polynomial was $x^2 - x + \frac{1}{6}$. Gauss then
changed the variable so that the interval of integration became $[-1, 1]$. In this case, he
had to choose the polynomial $U(x)$ of degree $n + 1$ so that

$$\frac{1}{2}\,U(x)\ln\frac{1 + \frac{1}{x}}{1 - \frac{1}{x}} = U(x)\left(\frac{1}{x} + \frac{1}{3x^3} + \frac{1}{5x^5} + \cdots\right) = U_1(x) + U_2(x)$$

had appropriate negative powers of $x$ with zero coefficients. In fact, the zeros $u$ of $U$
were related to zeros $a$ of $f$ by $u = 2a - 1$; $U_1$ and $U_2$ corresponded to $T_1$ and $T_2$ of
equation (24.23). With this change of variables, the polynomials for $n = 0, 1, 2$ were
$x$, $x^2 - \frac{1}{3}$, and $x^3 - \frac{3}{5}x$. Note that these are the Legendre polynomials of the first three
degrees, normalized so that they are monic.
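Gauss’s conditions on the coefficients $c_1, \ldots, c_{n+1}$ form a linear system: the coefficient of $x^{-j}$ in $f(x)\left(\frac{1}{x} + \frac{1}{2x^2} + \cdots\right)$ must vanish for $j = 1, \ldots, n + 1$. The following sketch solves that system exactly for small $n$ on the interval $[0, 1]$. It is a modern illustration rather than Gauss’s own computation; the names `solve` and `gauss_node_polynomial` are hypothetical, and plain Gauss–Jordan elimination over the rationals is assumed to be adequate for these small systems.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def gauss_node_polynomial(n):
    """Return [1, c_1, ..., c_{n+1}] for f(x) = x^(n+1) + c_1 x^n + ... + c_(n+1)."""
    size = n + 1
    # The coefficient of x^(-j) in f(x) * sum_m 1/(m x^m) is
    #   1/(size + j) + sum_i c_i / (size - i + j),  i = 1, ..., size.
    matrix = [[Fraction(1, size - i + j) for i in range(1, size + 1)]
              for j in range(1, size + 1)]
    rhs = [-Fraction(1, size + j) for j in range(1, size + 1)]
    return [Fraction(1)] + solve(matrix, rhs)


print(gauss_node_polynomial(0))   # x - 1/2
print(gauss_node_polynomial(1))   # x^2 - x + 1/6
print(gauss_node_polynomial(2))
```

The cases $n = 0$ and $n = 1$ reproduce the polynomials $x - \frac{1}{2}$ and $x^2 - x + \frac{1}{6}$ found in the text; the substitution $u = 2a - 1$ then carries their zeros to those of the monic Legendre polynomials on $[-1, 1]$.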

Gauss proceeded to give a method using continued fractions in order to quickly
determine the polynomials $f(x)$. From his paper on hypergeometric functions, he had
the expression

$$g(x) = \frac{1}{2}\ln\frac{x + 1}{x - 1} = \cfrac{1}{x - \cfrac{\frac{1}{3}}{x - \cfrac{\frac{2\cdot 2}{3\cdot 5}}{x - \cfrac{\frac{3\cdot 3}{5\cdot 7}}{x - \cdots}}}}. \tag{24.28}$$

He then showed that if the $n$th convergent of his continued fraction was $\frac{P_n(x)}{Q_n(x)}$, then

$$g(x) - \frac{P_n(x)}{Q_n(x)} = O\!\left(\frac{1}{x^{2n+1}}\right).$$

From this he could conclude that if $Q_n(x)$ was monic, then

$$Q_{n+1}(x) = U(x) \quad\text{and}\quad P_{n+1}(x) = U_1(x).$$
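The convergents of the continued fraction (24.28) can be generated by the usual three-term recurrences for numerators and denominators. The sketch below assumes the partial numerators are $b_k = \frac{k\cdot k}{(2k - 1)(2k + 1)}$, which matches the terms $\frac{1}{3}, \frac{2\cdot 2}{3\cdot 5}, \frac{3\cdot 3}{5\cdot 7}$ shown in (24.28); the function names are illustrative, and polynomials are stored with the highest power first.

```python
from fractions import Fraction


def step(current, previous, b):
    """Return the coefficients of x * current - b * previous."""
    shifted = current + [Fraction(0)]
    padded = [Fraction(0)] * (len(shifted) - len(previous)) + previous
    return [s - b * t for s, t in zip(shifted, padded)]


def convergent(n):
    """Numerator P_n and denominator Q_n of the nth convergent, for n >= 1."""
    p_prev, p = [Fraction(0)], [Fraction(1)]                 # P_0 = 0, P_1 = 1
    q_prev, q = [Fraction(1)], [Fraction(1), Fraction(0)]    # Q_0 = 1, Q_1 = x
    for k in range(1, n):
        b = Fraction(k * k, (2 * k - 1) * (2 * k + 1))
        p_prev, p = p, step(p, p_prev, b)
        q_prev, q = q, step(q, q_prev, b)
    return p, q


for n in (1, 2, 3):
    print(n, convergent(n))
```

With these assumptions the denominators come out as $x$, $x^2 - \frac{1}{3}$ and $x^3 - \frac{3}{5}x$, the monic Legendre polynomials mentioned in the text.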


In this manner, Gauss completely solved his problem. Observe that the points
$a_k$, $k = 0, 1, \ldots, n$ are the zeros of the Legendre polynomials and that the numbers $\lambda_k$
could be obtained from

$$\frac{P_{n+1}(x)}{Q_{n+1}(x)} = \sum_{k=0}^{n} \frac{\lambda_k}{x - a_k},$$

so that $\lambda_k = \frac{P_{n+1}(a_k)}{Q'_{n+1}(a_k)}$.
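A numerical illustration of the resulting rule on $[-1, 1]$ may be helpful. The sketch below does not follow Gauss’s continued fraction route; it assumes the modern shortcut of locating the zeros of the Legendre polynomial by Newton’s iteration and using the standard weight formula $\frac{2}{(1 - x_i^2)\,P_m'(x_i)^2}$ for $m$ nodes. The names `legendre_and_derivative` and `gauss_legendre` are hypothetical, and floating-point arithmetic is used.

```python
import math


def legendre_and_derivative(m, x):
    """Return P_m(x) and P_m'(x) for m >= 1, using the recurrence (24.2)."""
    p_prev, p = 1.0, x
    for k in range(1, m):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    derivative = m * (x * p - p_prev) / (x * x - 1.0)
    return p, derivative


def gauss_legendre(num_nodes):
    """Nodes and weights of the Gauss rule with num_nodes points on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, num_nodes + 1):
        x = math.cos(math.pi * (i - 0.25) / (num_nodes + 0.5))   # initial guess
        for _ in range(100):                                     # Newton iteration
            p, dp = legendre_and_derivative(num_nodes, x)
            correction = p / dp
            x -= correction
            if abs(correction) < 1e-15:
                break
        _, dp = legendre_and_derivative(num_nodes, x)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


nodes, weights = gauss_legendre(3)
for degree in range(8):
    approx = sum(w * x ** degree for x, w in zip(nodes, weights))
    exact = 2.0 / (degree + 1) if degree % 2 == 0 else 0.0
    print(degree, approx, exact)
```

With three nodes (the case $n = 2$ in the notation of the text) the rule reproduces the integrals of $1, x, \ldots, x^5$ and first fails at $x^6$, in agreement with exactness up to degree $2n + 1$.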


24.4 Jacobi’s Commentary on Gauss 


In the introduction to his 1826 paper¹⁴ on Gauss’s new method of approximate
quadrature, Jacobi remarked that the simplicity and elegance of Gauss’s results
led him to believe that there was a simple and direct way of deriving them. The
object of his paper was to present such a derivation, making use of his work
from his doctoral dissertation on the Waring–Lagrange interpolation formula. Jacobi
proceeded, in his usual lucid style, to show that Gauss’s numerical integration method
was effective because of its use of orthogonal polynomials. Abbreviating Jacobi’s
work for convenience, suppose $\phi(x) = \prod_{i=1}^{n}(x - x_i)$, where the $x_i$ are distinct, and
suppose $f(x)$ is a polynomial of degree $\le n - 1$. Then

$$\frac{f(x)}{\phi(x)} = \frac{A_1}{x - x_1} + \frac{A_2}{x - x_2} + \cdots + \frac{A_n}{x - x_n},$$

where

$$A_k = \lim_{x \to x_k} \frac{(x - x_k)\,f(x)}{\phi(x)} = \lim_{x \to x_k} \frac{(x - x_k)\,f(x)}{\phi(x) - \phi(x_k)} = \frac{f(x_k)}{\phi'(x_k)}.$$

So if $x_1, x_2, \ldots, x_n$ are the interpolation points, then any polynomial $f(x)$ of degree
at most $n - 1$ can be expressed by the formula

$$f(x) = \sum_{k=1}^{n} \frac{f(x_k)\,\phi(x)}{\phi'(x_k)(x - x_k)},$$

attributed by Jacobi to Lagrange. The integral of such a polynomial is given exactly
by the Newton–Cotes formula. On the other hand, if the degree of $f$ is greater than
$n - 1$, then divide $f(x)$ by $\phi(x)$ to get

$$\frac{f(x)}{\phi(x)} = V(x) + \frac{U(x)}{\phi(x)},$$

where $U(x)$ and $V(x)$ are polynomials and the degree of $U$ is less than or equal to
$n - 1$. Now assume with Jacobi that

$$f(x) = a + a_1x + a_2x^2 + \cdots + a_nx^n + a_{n+1}x^{n+1} + \cdots$$

and

$$\frac{1}{\phi(x)} = \frac{A_1}{x^n} + \frac{A_2}{x^{n+1}} + \cdots + \frac{A_{n+1}}{x^{2n}} + \cdots.$$

Then

$$V(x) = a_nA_1 + a_{n+1}(A_1x + A_2) + a_{n+2}(A_1x^2 + A_2x + A_3) + \cdots + a_{2n-1}(A_1x^{n-1} + A_2x^{n-2} + \cdots + A_n) + \cdots.$$

14 Jacobi (1826).

Jacobi observed that according to Newton’s method, to compute $\int_{-1}^{1} f(x)\,dx$, one
would substitute $U(x)$ for $f(x)$ and the error would be

$$\Delta = \int_{-1}^{1} f(x)\,dx - \int_{-1}^{1} U(x)\,dx = \int_{-1}^{1} \phi(x)V(x)\,dx.$$

He then noted that the expression for $V$ did not involve $a, a_1, a_2, \ldots, a_{n-1}$ and hence
the error, $\Delta$, would be independent of these coefficients of $f$. The question was
whether $\phi$ could be chosen so that the error would be independent of $a_n, a_{n+1}$, etc.
Clearly, if $\int_{-1}^{1} \phi(x)\,x^k\,dx = 0$ for $k = 0, 1, \ldots, l$, then $\Delta$ would also be independent of
$a_n, a_{n+1}, \ldots, a_{n+l}$. Since $\int_{-1}^{1} (\phi(x))^2\,dx > 0$, the value of $l$ could be at most $n - 1$.
Thus, if $\int_{-1}^{1} \phi(x)\,x^k\,dx = 0$ for $k = 0, 1, \ldots, n - 1$, then the formula for $\int_{-1}^{1} f(x)\,dx$ was exact for polynomials
of degree $\le 2n - 1$. This meant that $\phi(x)$ should be a constant multiple of the Legendre
polynomial of degree $n$ and Jacobi had succeeded in showing that orthogonality lay at
the root of the Gaussian method of numerical integration.
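Jacobi’s argument can be followed on a small example. The sketch below divides a cubic $f$ by a quadratic $\phi$, reads off the quotient $V$, and evaluates the error $\Delta = \int_{-1}^{1} \phi V\,dx$ exactly. The particular cubic and the helper names `divide`, `multiply`, `integrate` and `quadrature_error` are illustrative assumptions; polynomials are stored with the highest power first.

```python
from fractions import Fraction


def divide(f, phi):
    """Return (V, U) with f = V * phi + U and deg U < deg phi."""
    remainder = [Fraction(c) for c in f]
    quotient = []
    while len(remainder) >= len(phi):
        factor = remainder[0] / phi[0]
        quotient.append(factor)
        for i, c in enumerate(phi):
            remainder[i] -= factor * c
        remainder.pop(0)                    # the leading coefficient is now zero
    return quotient, remainder


def multiply(p, q):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def integrate(p):
    """Exact integral of p over [-1, 1]."""
    total = Fraction(0)
    for power, c in enumerate(reversed(p)):
        if power % 2 == 0:                  # odd powers integrate to zero
            total += 2 * c / (power + 1)
    return total


def quadrature_error(f, phi):
    """Jacobi's error term: the integral of phi * V over [-1, 1]."""
    quotient, _ = divide(f, phi)
    return integrate(multiply(phi, quotient)) if quotient else Fraction(0)


f = [1, 2, -1, 5]                                          # x^3 + 2x^2 - x + 5
print(quadrature_error(f, [1, 0, Fraction(-1, 3)]))        # nodes: zeros of x^2 - 1/3
print(quadrature_error(f, [1, 0, Fraction(-1, 4)]))        # nodes: x = -1/2 and x = 1/2
```

When $\phi(x) = x^2 - \frac{1}{3}$, a multiple of the Legendre polynomial of degree 2, the error vanishes for this cubic; for the nodes $\pm\frac{1}{2}$ it does not.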


24.5 Murphy and Ivory: The Rodrigues Formula 


Robert Murphy’s discussion of orthogonal polynomials appeared in his two publications
of 1833 and 1835 on the inverse method of definite integrals, written in
1832 and 1833, and in his 1833 treatise on physics.¹⁵ He considered the integral
$\phi(x) = \int_0^1 f(t)\,t^x\,dt$ and determined the form of the polynomial $f(t)$ such that $\phi(x)$
was zero for $x = 0, 1, \ldots, n - 1$. He let

$$f(t) = 1 + A_1t + A_2t^2 + \cdots + A_nt^n,$$

so that

$$\phi(x) = \frac{1}{x + 1} + \frac{A_1}{x + 2} + \frac{A_2}{x + 3} + \cdots + \frac{A_n}{x + n + 1} = \frac{P}{Q}, \tag{24.29}$$

where $Q = (x + 1)(x + 2)\cdots(x + n + 1)$ and $P$ was a polynomial of degree at most
$n$. To find an expression for $f(t)$ when $\phi(x) = 0$ for $x = 0, 1, \ldots, n - 1$, Murphy
argued that $P$ would have the form $cx(x - 1)\cdots(x - n + 1)$. Thus,

$$\frac{1}{x + 1} + \frac{A_1}{x + 2} + \frac{A_2}{x + 3} + \cdots + \frac{A_n}{x + n + 1} = \frac{cx(x - 1)\cdots(x - n + 1)}{(x + 1)(x + 2)\cdots(x + n + 1)}.$$

15 Murphy (1833a), (1833c) and (1835).

Multiplying both sides by $x + k$ and then setting $x = -k$ for $k = 1, \ldots, n + 1$, he got
the result

$$c = (-1)^n, \quad A_1 = -\frac{n}{1}\cdot\frac{n + 1}{1}, \quad A_2 = \frac{n(n - 1)}{1\cdot 2}\cdot\frac{(n + 1)(n + 2)}{1\cdot 2}, \quad \ldots.$$

Hence,

$$\begin{aligned}
f(t) &= 1 - \frac{n}{1}\cdot\frac{n + 1}{1}\,t + \frac{n(n - 1)}{1\cdot 2}\cdot\frac{(n + 1)(n + 2)}{1\cdot 2}\,t^2 - \cdots \\
&= \frac{1}{n!}\,\frac{d^n}{dt^n}\left[t^n\left(1 - \frac{n}{1}\,t + \frac{n(n - 1)}{1\cdot 2}\,t^2 - \cdots\right)\right] \\
&= \frac{1}{n!}\,\frac{d^n}{dt^n}\left(t^n(1 - t)^n\right),
\end{aligned}$$

completing Murphy’s proof of the Rodrigues formula.
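Murphy’s coefficients can be checked directly against his defining condition. In the sketch below the coefficient of $t^k$ is written as $(-1)^k\binom{n}{k}\binom{n+k}{k}$, which agrees with the values of $A_1$ and $A_2$ given above, and $\phi(x) = \sum_k \frac{A_k}{x + k + 1}$ is evaluated exactly. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def murphy_polynomial(n):
    """Coefficients [1, A_1, ..., A_n] of f(t), constant term first."""
    return [Fraction((-1) ** k * comb(n, k) * comb(n + k, k)) for k in range(n + 1)]


def moment(coeffs, x):
    """Exact value of phi(x), the integral of t^x f(t) over [0, 1], for integer x >= 0."""
    return sum(c / (x + k + 1) for k, c in enumerate(coeffs))


f = murphy_polynomial(3)
print(f)
print([moment(f, x) for x in range(5)])
```

For $n = 3$ the moments $\phi(0), \phi(1), \phi(2)$ vanish while $\phi(3)$ does not, as Murphy’s condition requires.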
Now Ivory’s proof involved differential equations. He showed that $P_k(x)$, the $k$th
Legendre polynomial, satisfied the equation

$$(k - n)(k + n + 1)(1 - x^2)^n\,\frac{d^nP_k}{dx^n} + \frac{d}{dx}\left((1 - x^2)^{n+1}\,\frac{d^{n+1}P_k}{dx^{n+1}}\right) = 0.$$

Ivory presented this result in his 1812 paper. Twelve years later, unaware of
Rodrigues’s earlier work, he observed that by a repeated use of this equation, he could
obtain the Rodrigues formula. He set

$$\phi_n = (1 - x^2)^n\,\frac{d^nP_k}{dx^n} \quad\text{and}\quad \phi_0 = P_k,$$

from which he had

$$\phi_0 + \frac{1}{k(k + 1)}\,\frac{d\phi_1}{dx} = 0, \qquad \phi_1 + \frac{1}{(k - 1)(k + 2)}\,\frac{d\phi_2}{dx} = 0, \qquad \ldots.$$

Thus,

$$P_k = \frac{(-1)^k}{1\cdot 2\cdot 3\cdots 2k}\,\frac{d^k}{dx^k}\left((1 - x^2)^k\,\frac{d^kP_k}{dx^k}\right).$$

Now, from (24.16), he could deduce $\frac{d^kP_k}{dx^k} = 1\cdot 3\cdot 5\cdots(2k - 1)$, and therefore he
could write the required result,

$$P_k(x) = \frac{(-1)^k}{2\cdot 4\cdot 6\cdots 2k}\,\frac{d^k}{dx^k}(1 - x^2)^k.$$

Also mentioned in Chapter 25 on q-series, Olinde Rodrigues employed differential
equations to obtain his formula in 1815; in fact, he referred to Ivory’s 1812 paper.
Still, note that Rodrigues has priority in this matter, since Ivory did not work out his
final result until 1824.
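Ivory’s form of the Rodrigues formula is easy to test by differentiating $(1 - x^2)^k$ symbolically. The following sketch is only a modern check, with the hypothetical function name `rodrigues`; polynomials are stored as coefficient lists with the constant term first, and $2\cdot 4\cdot 6\cdots 2k$ is computed as $2^k\,k!$.

```python
from fractions import Fraction
from math import comb, factorial


def rodrigues(k):
    """P_k from ((-1)^k / (2*4*...*2k)) * d^k/dx^k (1 - x^2)^k."""
    # Binomial expansion: (1 - x^2)^k = sum_j (-1)^j C(k, j) x^(2j)
    poly = [Fraction(0)] * (2 * k + 1)
    for j in range(k + 1):
        poly[2 * j] = Fraction((-1) ** j * comb(k, j))
    for _ in range(k):                                   # differentiate k times
        poly = [i * c for i, c in enumerate(poly)][1:]
    scale = Fraction((-1) ** k, 2 ** k * factorial(k))
    return [scale * c for c in poly]


for k in range(5):
    print(k, rodrigues(k))
```

The output for $k = 2$ and $k = 4$ agrees with the coefficients of $P_2$ and $P_4$ recorded in (24.4).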


24.6 Liouville’s Proof of the Rodrigues Formula 


In 1837, Ivory and Jacobi published a joint paper in Liouville’s journal, containing a
proof of the Rodrigues formula, using Lagrange inversion.¹⁶ They were both unaware
that the French mathematician Rodrigues had already published his result in 1815,
albeit in an obscure journal. This interesting collaboration between Ivory and Jacobi
took place at the suggestion of Jacobi, who wrote to Ivory that, since they had independently
obtained the Rodrigues formula, they could publish a joint paper to broadcast
this result in France, where it was unknown. In the same issue of his journal, Liouville
published an alternate, more transparent proof,¹⁷ in fact similar to one published by
Jacobi almost ten years earlier. Liouville started by reproducing Legendre’s result that

$$\int_{-1}^{1} X_mX_n\,dx = \frac{2}{2n + 1}\,\delta_{mn},$$

where $X_m$ denoted the Legendre polynomial of degree $m$. Liouville then observed that
$X_n$ was a polynomial of exact degree $n$, and hence any $n$th degree polynomial had to
be a linear combination of the polynomials $X_0, X_1, \ldots, X_n$. He let $y$ be any polynomial
of degree $n - 1$, so that for some constants $A_0, A_1, \ldots, A_{n-1}$

$$y = A_0 + A_1X_1 + A_2X_2 + \cdots + A_{n-1}X_{n-1}.$$

From this, he had

$$\frac{d^ny}{dx^n} = 0, \qquad \int_{-1}^{1} yX_n\,dx = 0.$$

Since $\frac{d^ny}{dx^n} = 0$, repeated integration by parts yielded

$$\begin{aligned}
\int_{-1}^{x} yX_n\,dt &= y\int_{-1}^{x} X_n\,dt - y'\int_{-1}^{x}\!\int_{-1}^{t} X_n\,dt_1\,dt + y''\int_{-1}^{x}\!\int_{-1}^{t}\!\int_{-1}^{t_1} X_n\,dt_2\,dt_1\,dt \\
&\quad - \cdots + (-1)^{n-1}\,y^{(n-1)}\int_{-1}^{x}\!\int_{-1}^{t}\cdots\int_{-1}^{t_{n-2}} X_n\,dt_{n-1}\,dt_{n-2}\cdots dt.
\end{aligned}$$

16 Ivory and Jacobi (1837).
17 Liouville (1837a).




Because the left-hand side was zero for $x = 1$, and $y$ was an arbitrary polynomial
of degree $n - 1$, for $x = 1$ he obtained

$$\int_{-1}^{1} X_n\,dt = \int_{-1}^{1}\!\int_{-1}^{t} X_n\,dt_1\,dt = \cdots = \int_{-1}^{1}\!\int_{-1}^{t}\cdots\int_{-1}^{t_{n-2}} X_n\,dt_{n-1}\cdots dt = 0.$$

Liouville denoted the polynomial of degree $2n$ in the last equation by $\phi(x)$, or,

$$\phi(x) = \int_{-1}^{x}\!\int_{-1}^{t}\cdots\int_{-1}^{t_{n-2}} X_n\,dt_{n-1}\,dt_{n-2}\cdots dt.$$

Then

$$\phi(1) = \phi'(1) = \cdots = \phi^{(n-1)}(1) = 0,$$

implying that $(x - 1)^n$ was a factor of $\phi(x)$. Since it was obvious that $\phi(-1) =
\phi'(-1) = \cdots = \phi^{(n-1)}(-1) = 0$, he could conclude that $(x + 1)^n$ was also a factor;
hence, $(x^2 - 1)^n$ was a factor of $\phi(x)$. Also, $\phi(x)$ was of degree $2n$, so, clearly,
$\phi(x) = D(x^2 - 1)^n$ for a constant $D$. Therefore, for some constant $H_n$,

$$X_n = H_n\,\frac{d^n}{dx^n}(x^2 - 1)^n = H_n\left((x + 1)^n\,\frac{d^n}{dx^n}(x - 1)^n + \frac{n}{1}\,\frac{d}{dx}(x + 1)^n\cdot\frac{d^{n-1}}{dx^{n-1}}(x - 1)^n + \cdots\right).$$

Observe that Liouville applied Leibniz’s formula for the $n$th derivative of a product.
Now note that, except for the first, every term in this expression was zero at $x = 1$, so
that he could write

$$X_n(1) = 1\cdot 2\cdot 3\cdots n\cdot 2^n H_n.$$


Note also that, for $x = 1$, the generating function of the Legendre polynomials is

$$(1 - 2xz + z^2)^{-1/2} = (1 - z)^{-1} = 1 + z + z^2 + \cdots.$$

So Liouville could conclude that $X_n(1) = 1$, and $H_n = \frac{1}{1\cdot 2\cdot 3\cdots n\cdot 2^n}$,
proving the result.

Liouville also proved that a function $f(x)$ could be expanded in terms of $X_n$.¹⁸ He
first set

$$F(x) = \sum_{n=0}^{\infty} \frac{2n + 1}{2}\,X_n\int_{-1}^{1} f(x)X_n\,dx.$$

18 Liouville (1837b).

Multiplying both sides by $X_n$ and integrating over $(-1, 1)$, he obtained

$$\int_{-1}^{1} (F(x) - f(x))\,X_n\,dx = 0,$$

and therefore

$$\int_{-1}^{1} (F(x) - f(x))\,y\,dx = 0$$

for an arbitrary polynomial $y$. Liouville took $y = x^n$ to get

$$\int_{-1}^{1} (F(x) - f(x))\,x^n\,dx = 0, \quad n = 0, 1, 2, \ldots.$$

He then concluded that $f(x) = F(x)$ and the result was proved.
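For a polynomial $f$ the expansion $F(x)$ terminates, and its coefficients $\frac{2n + 1}{2}\int_{-1}^{1} f(x)X_n\,dx$ can be computed exactly. The sketch below carries this out for the illustrative choice $f(x) = x^3$; the Legendre polynomials are generated from the recurrence (24.2), and the function names are hypothetical.

```python
from fractions import Fraction


def legendre(n_max):
    """Return [X_0, ..., X_{n_max}] as coefficient lists, constant term first."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]
        p_prev = polys[n - 1] + [Fraction(0)] * 2
        polys.append([((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, p_prev)])
    return polys[: n_max + 1]


def integrate_product(p, q):
    """Exact integral of p(x) * q(x) over [-1, 1]."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                total += a * b * Fraction(2, i + j + 1)
    return total


def legendre_coefficients(f, n_max):
    """Coefficients (2n + 1)/2 * integral of f * X_n over [-1, 1], n = 0..n_max."""
    return [Fraction(2 * n + 1, 2) * integrate_product(f, X)
            for n, X in enumerate(legendre(n_max))]


f = [Fraction(0), Fraction(0), Fraction(0), Fraction(1)]      # f(x) = x^3
coeffs = legendre_coefficients(f, 3)
# Rebuild F(x) = sum of coeffs[n] * X_n and compare it with f.
rebuilt = [sum(c * X[i] for c, X in zip(coeffs, legendre(3)) if i < len(X))
           for i in range(4)]
print(coeffs)
print(rebuilt == f)
```

Here $x^3 = \frac{3}{5}X_1 + \frac{2}{5}X_3$, so the rebuilt polynomial coincides with $f$; for a general continuous $f$ the identity $f = F$ rests on the theorem Liouville proved next.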

To show that his conclusion was justified, Liouville also derived the additional
theorem that if $f(x)$ was continuous and finite on $[a, b]$ and $\int_a^b x^nf(x)\,dx = 0$ for
$n = 0, 1, 2, \ldots$, then $f(x) = 0$ on $[a, b]$. His proof applied only to those functions
having a finite number of changes of sign in the interval $[a, b]$, though he failed to
remark on this. He began his proof by assuming the geometrically evident proposition
that if $f(x)$ was always nonnegative in $[a, b]$ and $\int_a^b f(x)\,dx = 0$, then $f(x)$ had to be
identically zero. We remark that Cauchy’s ideas on integrals and continuity from the
1820s can be applied to provide an effective proof of this assumption. Next, Liouville
supposed that $f(x)$ changed sign at the values $x_1, x_2, \ldots, x_n$ inside $[a, b]$. He let
$\psi(x) = (x - x_1)(x - x_2)\cdots(x - x_n)$, and noted that $f(x)\psi(x)$ would have no changes
of sign in $[a, b]$ and that $\int_a^b \psi(x)f(x)\,dx = 0$. He could conclude $f(x)\psi(x) = 0$
and $f(x) = 0$, giving him the required result. A modern proof of the proposition
might use the Weierstrass approximation theorem, but that was not stated until some
decades later. Moreover, observe that ideas such as uniform convergence had not been
discovered in Liouville’s time. In fact, he did not even clarify the type of interval
he was working with and had to explicitly state that $f(x)$ was finite, or bounded,
at all points.


24.7 The Jacobi Polynomials 


In his significant 1859 posthumously published paper, “Untersuchungen über die
Differentialgleichung der hypergeometrischen Reihe,”¹⁹ edited by Heine, Jacobi
used the hypergeometric differential equation to derive a Rodrigues-type formula
for hypergeometric polynomials. These polynomials are now referred to as Jacobi
polynomials, and Jacobi further showed them to be orthogonal with respect to
the beta distribution. In this paper, Jacobi also obtained the generating function
for Jacobi polynomials by an application of the Lagrange inversion formula. Note
that Jacobi polynomials are in fact generalizations of Legendre polynomials. It is
hard to determine exactly when Jacobi discovered these polynomials. In the 1840s, he
published some papers on hypergeometric functions and related topics, but remarks of
Kummer indicate that Jacobi had studied these functions even earlier than that. Jacobi
started his investigations with the observation that if $y$ satisfied the hypergeometric
differential equation, then

$$\begin{aligned}
x(1 - x)y'' + (c - (a + b + 1)x)\,y' - ab\,y &= 0, \\
x(1 - x)y''' + (c + 1 - (a + b + 3)x)\,y'' - (a + 1)(b + 1)\,y' &= 0, \\
x(1 - x)y^{(4)} + (c + 2 - (a + b + 5)x)\,y''' - (a + 2)(b + 2)\,y'' &= 0, \\
&\;\;\vdots \\
x(1 - x)y^{(n+1)} + (c + n - 1 - (a + b + 2n - 1)x)\,y^{(n)} - (a + n - 1)(b + n - 1)\,y^{(n-1)} &= 0.
\end{aligned} \tag{24.30}$$

19 Jacobi (1859).

To understand why these equations follow one after the other, note that, in Gauss’s 
notation, if 


b 
y = F(a,b,c,x), then ya(é )ra Lbs Leta e) 
Cc 


. : : a,b 
In more modern notation, one might write y as 2F\ ( ) . The parameters 
Cc 


a, b, c change toa + 1,b+ 1, c+ 1, respectively, when one takes the derivative of a 
hypergeometric function. Following Jacobi, multiply (24.30) by 


eam 2 gore eet 


and rewrite as 


("a — x)"My”) =(at+n—1)(b4+n Dx" — x)"  My@-D, 
x 


where M = x°—!(1 — x)¢+°~°, By iteration, he had 
n 


dx” 


(x"a — x)"My”) —a(a+1)---tn—Dbb+1)---b+n—1)My. 


Next Jacobi took $b = -n$, so that $y = F(-n,a,c,x)$ would be a polynomial of
degree $n$; then $y^{(n)}$ was a constant and the equation became

\[
F(-n,a,c,x) = \frac{x^{1-c}(1-x)^{c+n-a}}{c(c+1)\cdots(c+n-1)}\,\frac{d^n}{dx^n}\left[x^{c+n-1}(1-x)^{a-c}\right].
\]

Replacing $a$ by $a+n$, he obtained the Rodrigues-type formula

\[
X_n = F(-n,a+n,c,x) = \frac{x^{1-c}(1-x)^{c-a}}{c(c+1)\cdots(c+n-1)}\,\frac{d^n}{dx^n}\left[x^{c+n-1}(1-x)^{a+n-c}\right].
\]
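The Rodrigues-type formula can be checked exactly when $c$ and $a$ are integers with $a \ge c \ge 1$, since then every factor is a polynomial. The following is a minimal sketch under that assumption; the helper names (`rodrigues`, `hypergeometric_poly`, and so on) are illustrative and do not come from Jacobi's paper.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    # Polynomials are coefficient lists in increasing degree.
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_pow(p, k):
    out = [Fraction(1)]
    for _ in range(k):
        out = poly_mul(out, p)
    return out

def poly_diff(p, times):
    for _ in range(times):
        p = [k * p[k] for k in range(1, len(p))] or [Fraction(0)]
    return p

def poly_divexact(p, d):
    # Long division; the remainder is required to vanish.
    p = p[:]
    q = [Fraction(0)] * (len(p) - len(d) + 1)
    for k in range(len(q) - 1, -1, -1):
        q[k] = p[k + len(d) - 1] / d[-1]
        for j, dj in enumerate(d):
            p[k + j] -= q[k] * dj
    assert all(c == 0 for c in p), "division was not exact"
    return q

def rising(a, n):
    # Shifted factorial a(a+1)...(a+n-1).
    out = Fraction(1)
    for k in range(n):
        out *= a + k
    return out

def hypergeometric_poly(n, b, c):
    # Coefficients of F(-n, b, c, x), a polynomial of degree n.
    return [rising(-n, k) * rising(b, k) / (rising(c, k) * factorial(k))
            for k in range(n + 1)]

def rodrigues(n, a, c):
    # x^(1-c) (1-x)^(c-a) / (c)_n * d^n/dx^n [ x^(c+n-1) (1-x)^(a+n-c) ]
    x, one_minus_x = [Fraction(0), Fraction(1)], [Fraction(1), Fraction(-1)]
    inner = poly_mul(poly_pow(x, c + n - 1), poly_pow(one_minus_x, a + n - c))
    derivative = poly_diff(inner, n)
    divisor = poly_mul(poly_pow(x, c - 1), poly_pow(one_minus_x, a - c))
    return [coef / rising(c, n) for coef in poly_divexact(derivative, divisor)]

print(rodrigues(2, 1, 1) == hypergeometric_poly(2, 3, 1))
```

With $a = c = 1$ the polynomials $X_n$ are the Legendre polynomials shifted to $[0,1]$, which gives a familiar special case to compare against.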


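Jacobi used the Lagrange inversion formula to find the generating function of the
polynomials $X_n$; for $\xi = 1-2x$ and $(c)_n$ denoting the shifted factorial:

\[
\sum_{n=0}^{\infty}\frac{(c)_n}{n!}\,h^n X_n = \frac{\left(1-h+\sqrt{1-2h\xi+h^2}\right)^{1-c}\left(1+h+\sqrt{1-2h\xi+h^2}\right)^{c-a}}{2^{1-a}\sqrt{1-2h\xi+h^2}}.
\]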


He then used the hypergeometric differential equation to prove the orthogonality
relation for $X_n$ when $c > 0$ and $a+1-c > 0$. Observe that the latter conditions were
necessary for the convergence of the integrals. He then let

\[
J_{m,n} = \int_0^1 X_m X_n\, x^{c-1}(1-x)^{a-c}\,dx.
\]

Since $X_n$ satisfied the differential equation

\[
x(1-x)X_n'' + (c-(a+1)x)X_n' = -n(n+a)X_n,
\]

he could deduce that

\[
\begin{aligned}
-n(n+a)J_{m,n} &= \int_0^1 X_m \frac{d}{dx}\left[x^c(1-x)^{a+1-c}X_n'\right]dx\\
&= \int_0^1 X_n \frac{d}{dx}\left[x^c(1-x)^{a+1-c}X_m'\right]dx = -m(m+a)J_{m,n}.
\end{aligned}
\]

Thus, taking $m \neq n$, he had $J_{m,n} = 0$. When $m = n$, then integration by parts
yielded

\[
n(n+a)J_{n,n} = \int_0^1 X_n' X_n'\, x^c(1-x)^{a+1-c}\,dx.
\]

Since $X_n'$ and $X_n''$ were again hypergeometric polynomials, this relation implied
that

\[
(n-1)(n+a+1)\int_0^1 X_n' X_n'\, x^c(1-x)^{a+1-c}\,dx = \int_0^1 X_n'' X_n''\, x^{c+1}(1-x)^{a+2-c}\,dx.
\]

Now $X_n^{(n)}$ was a constant so a repeated application of this formula finally produced a
beta integral, computable in terms of gamma functions. The eventual result was then

\[
J_{n,n} = \frac{n!}{a+2n}\cdot\frac{(\Gamma(c))^2\,\Gamma(a-c+n+1)}{\Gamma(a+n)\,\Gamma(c+n)}.
\]
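The orthogonality relation and the value of $J_{n,n}$ can be confirmed in exact arithmetic when $c$ and $a$ are integers with $a \ge c \ge 1$, because the weight $x^{c-1}(1-x)^{a-c}$ is then a polynomial and the gamma functions reduce to factorials. The sketch below works under that assumption; the function names are illustrative only.

```python
from fractions import Fraction
from math import factorial

def rising(a, n):
    # Shifted factorial a(a+1)...(a+n-1).
    out = Fraction(1)
    for k in range(n):
        out *= a + k
    return out

def X(n, a, c):
    # Coefficients (increasing degree) of X_n = F(-n, a+n, c, x).
    return [rising(-n, k) * rising(a + n, k) / (rising(c, k) * factorial(k))
            for k in range(n + 1)]

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def integrate01(p):
    # Exact integral over [0, 1] of a polynomial.
    return sum(coef / (k + 1) for k, coef in enumerate(p))

def J(m, n, a, c):
    weight = [Fraction(1)]
    for _ in range(c - 1):
        weight = poly_mul(weight, [Fraction(0), Fraction(1)])   # factor x
    for _ in range(a - c):
        weight = poly_mul(weight, [Fraction(1), Fraction(-1)])  # factor 1 - x
    return integrate01(poly_mul(poly_mul(X(m, a, c), X(n, a, c)), weight))

def J_closed(n, a, c):
    # n!/(a+2n) * Gamma(c)^2 Gamma(a-c+n+1) / (Gamma(a+n) Gamma(c+n)), integer arguments.
    return (Fraction(factorial(n), a + 2 * n)
            * Fraction(factorial(c - 1) ** 2 * factorial(a - c + n),
                       factorial(a + n - 1) * factorial(c + n - 1)))

print(J(1, 1, 4, 2), J_closed(1, 4, 2), J(0, 1, 4, 2))
```

For $a = 4$, $c = 2$, $n = 1$ both expressions reduce to $\int_0^1 (1-\tfrac52 x)^2 x(1-x)^2\,dx$, which can also be evaluated by hand from beta integrals.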


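The polynomials $X_n$ are the Jacobi polynomials, except for a constant factor. In more
modern notation, taking $c = \alpha+1$ and $a = \alpha+\beta+1$ in the expression for $X_n$, the
Jacobi polynomials may be expressed as

\[
\begin{aligned}
P_n^{(\alpha,\beta)}(x) &:= \frac{(\alpha+1)_n}{n!}\,F\!\left(-n,\,n+\alpha+\beta+1,\,\alpha+1,\,\frac{1-x}{2}\right)\\
&= (-1)^n\,\frac{(1-x)^{-\alpha}(1+x)^{-\beta}}{2^n\,n!}\,\frac{d^n}{dx^n}\left[(1-x)^{n+\alpha}(1+x)^{n+\beta}\right].
\end{aligned}
\]

Thus, observe that the orthogonality relation would hold over $[-1,1]$ with respect
to the beta distribution $(1-x)^\alpha(1+x)^\beta$ for $\alpha, \beta > -1$. Also we note that it can be
shown, by using the hypergeometric differential equation, that the Jacobi polynomial
$P_n^{(\alpha,\beta)}(x)$ is a solution of the differential equation

\[
(1-x^2)y'' + (\beta-\alpha-(\alpha+\beta+2)x)y' + n(n+\alpha+\beta+1)y = 0. \tag{24.31}
\]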

Jacobi briefly noted that for $x = \frac{1-\xi}{2}$,

\[
(1-2h\xi+h^2)^{-c} = \sum_{n=0}^{\infty} h^n Y_n, \tag{24.32}
\]

where

\[
\begin{aligned}
Y_n &= \frac{(2c)_n}{n!}\,F\!\left(-n,\,n+2c,\,c+\tfrac12,\,\frac{1-\xi}{2}\right)\\
&= \frac{4^n\,c(c+1)\cdots(c+n-1)}{(2c+n)(2c+n+1)\cdots(2c+2n-1)}\cdot\frac{[x(1-x)]^{\frac12(1-2c)}}{n!}\,\frac{d^n}{dx^n}\,[x(1-x)]^{n+\frac12(2c-1)}.
\end{aligned}
\]

We now designate the $Y_n$ as ultraspherical or Gegenbauer polynomials:

\[
Y_n := C_n^{c}(\xi) = \frac{(2c)_n}{\left(c+\frac12\right)_n}\,P_n^{\left(c-\frac12,\,c-\frac12\right)}(\xi),
\]

where $(a)_n$ denotes the shifted factorial $a(a+1)\cdots(a+n-1)$. Gegenbauer polynomials,²⁰
named after the Austrian mathematician Leopold Gegenbauer (1849-1903),
student of Weierstrass and Kronecker, are special cases of Jacobi polynomials.
They occur when the parameters $\alpha$ and $\beta$ are equal, and they are of great independent
interest. Note that the generating function for the ultraspherical polynomials (24.32)
is different from the generating function obtained from the one for the Jacobi
polynomials.


20 Gegenbauer (1884).


24.8 Stieltjes: Zeros of Jacobi Polynomials 


In an 1886 paper published in Acta Mathematica,²¹ Stieltjes presented an interesting
interpretation of the zeros of Jacobi polynomials by viewing them in terms of the
positions of some point masses distributed on $[-1,1]$, repelling one another with a
force inversely proportional to the distance between them.

Thomas Jan Stieltjes (1856-1894) entered the Polytechnical School in Delft in 
1873. He spent nearly all his time studying the works of Gauss and Jacobi, as a result 
of which he was unable to pass his examinations in spite of repeated attempts. Thus, 
he found himself unable to find a suitable position in the Netherlands, except as a 
calculator at an observatory. In 1882, he began a correspondence with Hermite, by 
whose assistance in 1886 he defended a thesis on asymptotic series at the Sorbonne 
and eventually obtained a professorship at Toulouse in France. 

In 1894, Stieltjes published his famous paper²² on the convergence of continued
fractions of the form

\[
\cfrac{1}{a_1 z + \cfrac{1}{a_2 + \cfrac{1}{a_3 z + \cfrac{1}{a_4 + \cdots}}}}, \qquad a_n > 0,\quad n = 1,2,3,\ldots,\quad z \in \mathbb{C}.
\]

This groundbreaking paper devotes one chapter to the Stieltjes integral. Since we
will utilize this integral repeatedly in this book, we present Stieltjes's definition at the
end of this section. The paper contained many important results,²³ as Poincaré's praise
of the work²⁴ indicates:


Therefore Stieltjes’ work is one of the most remarkable Memoires in Analysis which have 
been written in the past years; it adds to many others which have placed their author in an 
eminent rank within the science of our period.... The committee takes pride in proposing the 
Academy award Mr. Stieltjes the highest evidence of his approval by ordering the insertion of his 
Memoire “Sur les fractions continues” into the Collection of foreign Scholars (in the Academy) 
and the committee expresses the wish that a prize could be awarded him from the Lecomte 
foundation. 


In his 1886 Acta Mathematica paper,²⁵ Stieltjes took masses of mass $b$ fixed at $-1$
and of mass $a$ fixed at $1$, along with unit masses located at $x_1 > x_2 > \cdots > x_n$.
He assumed that the masses were in equilibrium, meaning that the total force on the
unit mass at each point $x_i$, $i = 1,2,\ldots,n$, was zero:

\[
\frac{b}{1+x_i} - \frac{a}{1-x_i} + \sum_{j\neq i}\frac{1}{x_i-x_j} = 0, \qquad i = 1,2,\ldots,n. \tag{24.33}
\]


21 Stieltjes (1886b), especially pp. 387-388.

22 Stieltjes (1894).

23 See W. Van Assche in Stieltjes (1993) vol. I, pp. 6-11.

24 Stieltjes (1993) vol. I, p. 4.

25 Stieltjes (1886b).

Now in a paper of 1885,²⁶ Stieltjes had already shown that a condition more general
than (24.33) would imply that the polynomial

\[ y = f(x) = (x-x_1)(x-x_2)\cdots(x-x_n) \tag{24.34} \]

satisfied a second-order linear differential equation. Thus, Stieltjes could conclude²⁷
from (24.33) that $y = f(x)$ would satisfy the equation

\[ (1-x^2)y'' + 2(b-a-(a+b)x)y' + Cy = 0, \tag{24.35} \]

where $C = n(n+2a+2b-1)$. But to prove (24.35), following Stieltjes's method of
1885,²⁸ observe that the logarithmic derivative of (24.34) produces

\[
\frac{y'}{y} - \frac{1}{x-x_i} = \frac{1}{x-x_1} + \cdots + \frac{1}{x-x_{i-1}} + \frac{1}{x-x_{i+1}} + \cdots + \frac{1}{x-x_n}; \tag{24.36}
\]

next, following Stieltjes, find the limit of (24.36) as $x \to x_i$ by an application of
L'Hôpital's rule, to get

\[
\frac{1}{2}\,\frac{y''(x_i)}{y'(x_i)} = \sum_{j\neq i}\frac{1}{x_i-x_j}. \tag{24.37}
\]

Combine (24.33) and (24.37) to get

\[
\frac{1}{2}\,\frac{y''(x_i)}{y'(x_i)} = \frac{a}{1-x_i} - \frac{b}{1+x_i}
\]

or

\[
(1-x_i^2)\,y''(x_i) + 2(b-a-(a+b)x_i)\,y'(x_i) = 0, \qquad i = 1,\ldots,n.
\]

Now since $y = f(x)$ is a polynomial of degree $n$, the expression

\[ (1-x^2)\,y''(x) + 2(b-a-(a+b)x)\,y'(x) \]

is a polynomial of degree $\le n$ and is equal to zero for $x = x_i$, $i = 1,2,\ldots,n$; thus it
is a constant multiple of $y$ and, finally, $y$ satisfies the differential equation

\[ (1-x^2)y'' + 2(b-a-(a+b)x)y' + Cy = 0. \tag{24.38} \]


26 Stieltjes (1885a).
27 Stieltjes (1886b).
28 Stieltjes (1885a) p. 324.

As for the constant $C$, Stieltjes found it by equating the coefficient of $x^n$ in (24.38) to zero:

\[ -n(n-1) - 2(a+b)n + C = 0 \]

or

\[ C = n(n+2a+2b-1). \]

Now if we set $a = \frac{\alpha+1}{2}$ and $b = \frac{\beta+1}{2}$, then (24.38) becomes

\[ (1-x^2)y'' + (\beta-\alpha-(\alpha+\beta+2)x)y' + n(n+\alpha+\beta+1)y = 0. \]

Comparison with (24.31) shows that the polynomial

\[ f(x) = (x-x_1)(x-x_2)\cdots(x-x_n) \]

is in fact the Jacobi polynomial $P_n^{(\alpha,\beta)}(x)$, except for a constant factor. Thus Stieltjes
had actually proved: Let $1 > x_1 > x_2 > \cdots > x_n > -1$ and let $x_i$ satisfy the
conditions given in (24.33); then the $x_i$, $i = 1,2,\ldots,n$, are the zeros of the Jacobi
polynomial $P_n^{(2a-1,\,2b-1)}(x)$.
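Stieltjes's theorem lends itself to a numerical spot check: compute the zeros of $P_n^{(2a-1,2b-1)}$ and substitute them into the equilibrium condition, written here as $b/(1+x_i) - a/(1-x_i) + \sum_{j\ne i} 1/(x_i-x_j) = 0$ for mass $b$ at $-1$ and mass $a$ at $1$. The sketch below is only an illustration: it evaluates the polynomial from its terminating hypergeometric series and locates the zeros by a grid scan followed by bisection, a choice made for simplicity rather than taken from the source. The values $a = 0.8$, $b = 1.7$ are arbitrary.

```python
def jacobi_value(n, alpha, beta, xi):
    # P_n^(alpha,beta)(xi) up to the constant factor (alpha+1)_n / n!,
    # summed from the terminating series in t = (1 - xi) / 2.
    t = (1.0 - xi) / 2.0
    term, total = 1.0, 1.0
    for k in range(n):
        term *= (-n + k) * (n + alpha + beta + 1 + k) / ((alpha + 1 + k) * (k + 1)) * t
        total += term
    return total

def jacobi_zeros(n, alpha, beta, steps=4000):
    # Scan (-1, 1) for sign changes, then refine each bracket by bisection.
    f = lambda xi: jacobi_value(n, alpha, beta, xi)
    grid = [-1.0 + 2.0 * k / steps for k in range(steps + 1)]
    roots = []
    for left, right in zip(grid, grid[1:]):
        f_left, f_right = f(left), f(right)
        if f_right == 0.0:
            roots.append(right)
        elif f_left * f_right < 0.0:
            for _ in range(100):
                mid = 0.5 * (left + right)
                f_mid = f(mid)
                if f_left * f_mid <= 0.0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            roots.append(0.5 * (left + right))
    return roots

def equilibrium_residuals(xs, a, b):
    # Total force on each unit mass: mass b sits at -1, mass a sits at +1.
    return [b / (1.0 + xi) - a / (1.0 - xi)
            + sum(1.0 / (xi - xj) for j, xj in enumerate(xs) if j != i)
            for i, xi in enumerate(xs)]

a, b, n = 0.8, 1.7, 5
zeros = jacobi_zeros(n, 2 * a - 1, 2 * b - 1)
print(len(zeros), max(abs(r) for r in equilibrium_residuals(zeros, a, b)))
```

For $n = 1$ the single zero is $(b-a)/(a+b)$, so the unit mass is pushed away from the heavier fixed mass, which gives a quick sanity check on the signs.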

Also note that we can reformulate this theorem of Stieltjes in terms of the potential
energy of the system with mass $b$ fixed at $-1$, mass $a$ fixed at $1$ and unit masses at
$x_1 > x_2 > \cdots > x_n$, between $-1$ and $1$. The logarithmic potential energy can be
given as

\[
T(x_1,x_2,\ldots,x_n) = -a\sum_{i=1}^{n}\ln|1-x_i| - b\sum_{i=1}^{n}\ln|1+x_i| - \sum_{1\le i<j\le n}\ln|x_i-x_j|. \tag{24.39}
\]

The system is in equilibrium when $\frac{\partial T}{\partial x_i} = 0$, $i = 1,2,\ldots,n$, equations identical with
(24.33). Thus, we can say that when the logarithmic potential energy (24.39) is at a
minimum, then $x_1,x_2,\ldots,x_n$ are the zeros of the Jacobi polynomial $P_n^{(2a-1,\,2b-1)}(x)$.

Turning to Stieltjes's definition of his integral,²⁹ he began with an increasing
function $\phi(x)$ on an interval $[a,b]$. He showed that $\phi$ must have a left- and right-hand
limit at each $x \in (a,b)$, denoting these limits by $\phi^-(x)$ and $\phi^+(x)$ respectively.
He observed that there would be a jump discontinuity of $\phi^+(x) - \phi^-(x)$ at $x$ when
$\phi^+(x) \neq \phi^-(x)$. Since the sum of the jump discontinuities in $[a,b]$ could not exceed
$\phi(b) - \phi(a)$, he was able to prove that the number of points of discontinuity for
$\phi$ was countable. But the number of points in $[a,b]$ was uncountable, so that every
subinterval of $[a,b]$, no matter how small, had to contain points of continuity of $\phi$.
He next considered the set of points

\[ a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b \]

and took numbers $\xi_k$, $k = 1,2,\ldots,n$, such that $x_{k-1} \le \xi_k \le x_k$. Stieltjes then wrote
that the limit, if it existed, of the sum

\[ f(\xi_1)(\phi(x_1)-\phi(x_0)) + f(\xi_2)(\phi(x_2)-\phi(x_1)) + \cdots + f(\xi_n)(\phi(x_n)-\phi(x_{n-1})) \]

was denoted by

\[ \int_a^b f(u)\,d\phi(u). \tag{24.40} \]

His clear meaning here was that the limit was taken as the norm
$\|P\| = \max\{|x_i - x_{i-1}| : i = 1,\ldots,n\}$ of the partition $P(x_0,x_1,\ldots,x_n)$ tended to zero. Note that in his
definition of the integral, Stieltjes did not use the upper and lower sums, considered
by Poincaré's thesis advisor, Gaston Darboux.³⁰


29 Stieltjes (1993) vol. 2, pp. 665-668.

Finally, Stieltjes noted that if $f$ was continuous on $[a,b]$, then the integral (24.40)
would exist. Luxemburg has remarked³¹ that the Stieltjes integral went unnoticed
by mathematicians until 1909 when Frigyes Riesz employed it to state his famous
theorem on continuous linear functionals on the space of continuous functions on $[a,b]$.³²
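Stieltjes's sum is easy to evaluate directly, and doing so shows how a jump in $\phi$ contributes a point mass to the integral. The following sketch uses a uniform partition with midpoints for the $\xi_k$; the integrand $f(u) = u^2$ and the function $\phi(u) = u$ plus a unit jump at $u = 1/2$ are illustrative choices, not examples from Stieltjes's paper. For these choices the integral should equal $\int_0^1 u^2\,du + f(1/2) = 7/12$.

```python
def stieltjes_sum(f, phi, a, b, n):
    # Sum of f(xi_k) * (phi(x_k) - phi(x_{k-1})) over a uniform partition,
    # taking xi_k to be the midpoint of [x_{k-1}, x_k].
    total = 0.0
    for k in range(1, n + 1):
        x_prev = a + (b - a) * (k - 1) / n
        x_k = a + (b - a) * k / n
        xi = 0.5 * (x_prev + x_k)
        total += f(xi) * (phi(x_k) - phi(x_prev))
    return total

f = lambda u: u * u
phi = lambda u: u + (1.0 if u >= 0.5 else 0.0)  # increasing, with a unit jump at 1/2

for n in (10, 100, 1000):
    print(n, stieltjes_sum(f, phi, 0.0, 1.0, n))
```

The sums approach $7/12$ as the norm of the partition shrinks, in line with Stieltjes's remark that continuity of $f$ suffices for the limit to exist.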


24.9 Askey: Discriminant of Jacobi Polynomials 


In 1983-84, Richard Askey observed that he could find the discriminant of the Jacobi 
polynomial by a new method and he presented this derivation in his lectures. First, 
take $f(x)$ to be a polynomial of degree $n$ such that

\[ f(x) = a_0x^n + a_1x^{n-1} + \cdots + a_n = a_0(x-x_1)(x-x_2)\cdots(x-x_n). \]

The discriminant $D_n(f)$ of $f(x)$ is defined by

\[ D_n(f) = a_0^{2n-2}\prod_{1\le i<j\le n}(x_i-x_j)^2. \tag{24.41} \]
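As a small illustration of definition (24.41), the discriminant can be computed directly from the leading coefficient and the roots. The function name below is hypothetical; the two examples are polynomials with integer roots, chosen so that the result can be compared with the familiar coefficient formulas ($b^2 - 4ac$ for a quadratic).

```python
from itertools import combinations

def discriminant_from_roots(a0, roots):
    # D_n(f) = a0^(2n-2) * product over i < j of (x_i - x_j)^2
    n = len(roots)
    product = 1
    for xi, xj in combinations(roots, 2):
        product *= (xi - xj) ** 2
    return a0 ** (2 * n - 2) * product

# 2x^2 - 6x + 4 = 2(x - 1)(x - 2): b^2 - 4ac = 36 - 32 = 4
print(discriminant_from_roots(2, [1, 2]))
# x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
print(discriminant_from_roots(1, [1, 2, 3]))
```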


Now observe that the condition (24.39), where $a > 0$, $b > 0$, implies that the
maximum of

\[ H(x_1,x_2,\ldots,x_n) = \prod_{i=1}^{n} x_i^{a}(1-x_i)^{b}\prod_{1\le i<j\le n}|x_i-x_j| \tag{24.42} \]

occurs when $x_1,x_2,\ldots,x_n$ are the zeros of the Jacobi polynomial (with change of
variables) $P_n^{(2a-1,\,2b-1)}(1-2x)$.


30 Darboux (1875).
31 Stieltjes (1993) vol. I, pp. 60-65.
32 Riesz (1909). For a more modern treatment, see Douglas (1972) pp. 18-20.

Askey perceived that by maximizing (24.42) and using the Selberg integral, he could
arrive at a new derivation for the discriminant of the Jacobi polynomial. For this
purpose, he used a result contained in the book of his colleague Walter Rudin:³³

Let $\mu$ be a positive measure on a measure space $X$; let

\[ \|f\|_k = \left(\int_X |f|^k\,d\mu\right)^{1/k} < \infty \]

for some $k$ in $0 < k < \infty$, with $\|f\|_\infty > 0$. Then

\[ \|f\|_k \to \|f\|_\infty \quad\text{as } k \to \infty. \]
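The limit in Rudin's result can be watched numerically in a one-variable toy case, which is the $n = 1$, $a = b = 1/2$ instance of Askey's setting: for $f(x) = x(1-x)$ on $[0,1]$ the integral of $f^k$ is the beta integral $B(k+1,k+1)$, and the maximum of $f$ is $1/4$. The sketch below assumes only this closed form and uses `math.lgamma` to avoid overflow for large $k$.

```python
from math import exp, lgamma

def lk_norm(k):
    # ||f||_k for f(x) = x(1 - x) on [0, 1]:
    # the integral of f^k is B(k+1, k+1) = Gamma(k+1)^2 / Gamma(2k+2).
    log_integral = 2.0 * lgamma(k + 1.0) - lgamma(2.0 * k + 2.0)
    return exp(log_integral / k)

for k in (1, 10, 100, 10**4, 10**6):
    print(k, lk_norm(k))
```

The printed values increase toward $0.25$, the supremum norm of $f$. Askey's argument applies the same idea in $n$ variables, with Selberg's formula supplying the integral in closed form.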
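Apply Rudin's result to the situation

\[
d\mu = dx_1\,dx_2\cdots dx_n, \qquad X = [0,1]^n, \qquad
f(x_1,x_2,\ldots,x_n) = \prod_{i=1}^{n} x_i^{2a}(1-x_i)^{2b}\prod_{1\le i<j\le n}(x_i-x_j)^2 \tag{24.43}
\]

and observe that since $f$ is continuous on $[0,1]^n$,

\[ \|f\|_\infty = \max_{x\in[0,1]^n} f(x). \]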


Askey next took Selberg's integral formula (17.127)

\[
\int_X \prod_{i=1}^{n} x_i^{\alpha-1}(1-x_i)^{\beta-1}\prod_{1\le i<j\le n}|x_i-x_j|^{2\gamma}\,d\mu
= \prod_{j=1}^{n}\frac{\Gamma(\alpha+(j-1)\gamma)\,\Gamma(\beta+(j-1)\gamma)\,\Gamma(1+j\gamma)}{\Gamma(\alpha+\beta+(n+j-2)\gamma)\,\Gamma(1+\gamma)},
\]

in which he set $\alpha-1 = 2ak$, $\beta-1 = 2bk$, and $\gamma = k$ to obtain

\[
\int_X f^k\,d\mu = \prod_{j=1}^{n}\frac{\Gamma((2a+j-1)k+1)\,\Gamma((2b+j-1)k+1)\,\Gamma(jk+1)}{\Gamma((2a+2b+n+j-2)k+2)\,\Gamma(k+1)}.
\]

He next let $k \to \infty$ and applied Stirling's approximation to arrive at

\[
\max_{x\in[0,1]^n} f(x) = \prod_{j=1}^{n}\frac{(2a+j-1)^{2a+j-1}(2b+j-1)^{2b+j-1}\,j^{\,j}}{(2a+2b+n+j-2)^{2a+2b+n+j-2}}, \tag{24.44}
\]

implying that if $x_1,x_2,\ldots,x_n$ were zeros of the Jacobi polynomial $P_n^{(2a-1,\,2b-1)}(1-2x)$,
then

\[
\prod_{i=1}^{n} x_i^{2a}(1-x_i)^{2b}\prod_{1\le i<j\le n}(x_i-x_j)^2 = \prod_{j=1}^{n}\frac{(2a+j-1)^{2a+j-1}(2b+j-1)^{2b+j-1}\,j^{\,j}}{(2a+2b+n+j-2)^{2a+2b+n+j-2}}. \tag{24.45}
\]


33 Rudin (1966) p. 70.

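But since

\[
\frac{n!}{(2a)_n}\,P_n^{(2a-1,\,2b-1)}(1-2x) = {}_2F_1\!\left(\begin{matrix}-n,\ n+2a+2b-1\\ 2a\end{matrix};x\right)
= \frac{(-1)^n(2a+2b+n-1)_n}{(2a)_n}\,(x-x_1)(x-x_2)\cdots(x-x_n), \tag{24.46}
\]

Askey could write

\[ \prod_{j=1}^{n} x_j^{2a} = \left(\frac{(2a)_n}{(2a+2b+n-1)_n}\right)^{2a} \tag{24.47} \]

and

\[
\prod_{j=1}^{n}(1-x_j)^{2b} = \left(\frac{(-1)^n(2a)_n}{(2a+2b+n-1)_n}\;{}_2F_1\!\left(\begin{matrix}-n,\ n+2a+2b-1\\ 2a\end{matrix};1\right)\right)^{2b}. \tag{24.48}
\]

Moreover, he could sum the ${}_2F_1$ in (24.48) by the Vandermonde, or
Chu-Vandermonde, identity discussed in Section 25.10. Thus, he found that

\[ \prod_{j=1}^{n}(1-x_j)^{2b} = \left(\frac{(2b)_n}{(2a+2b+n-1)_n}\right)^{2b}. \tag{24.49} \]

Using (24.46), the value of $a_0$ in (24.41) must be $\frac{(-1)^n(2a+2b+n-1)_n}{n!}$. Thus, by
calling upon (24.45), (24.47), and (24.49), and replacing $2a-1$, $2b-1$ by $\alpha$, $\beta$, Askey
finally got

\[
D_n\!\left(P_n^{(\alpha,\beta)}(x)\right) = 2^{-n(n-1)}\prod_{j=1}^{n} j^{\,j-2n+2}(j+\alpha)^{j-1}(j+\beta)^{j-1}(n+j+\alpha+\beta)^{n-j},
\]

the discriminant of a Jacobi polynomial that was apparently first discovered by
Stieltjes.³⁴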


34 Stieltjes (1885b). Also see Szegő (1975) pp. 142-143.


24.10 Chebyshev: Discrete Orthogonal Polynomials


P. L. Chebyshev introduced discrete orthogonal polynomials into mathematics in his
1855 article, “Sur les fractions continues.”³⁵ His work in this area, like many of
his other efforts, was motivated by practical problems for which he sought effective 
solutions. Chebyshev made frequent use of orthogonal polynomials, continuous as 
well as discrete, and he was probably the first mathematician to emphasize their 
importance and applicability to problems in both pure and applied mathematics. 
Chebyshev was greatly influenced in this connection by the papers of Gauss and 
Jacobi on numerical integration. Chebyshev studied at Moscow University from 1837 
to 1841 where N. D. Brashman instructed him in practical mechanics, motivating some 
of Chebyshev’s later work. In 1846, Chebyshev wrote a master’s thesis on a topic in 
probability; this subject also became his lifelong interest. Chebyshev’s 1855 paper laid 
the foundation for his work on orthogonal polynomials. In presenting his work, we at 
times follow the notation given by N. I. Akhiezer in his article on Chebyshev’s work. 
This notation more clearly reveals the dependence of certain quantities on the given 
variables. 

Chebyshev began his paper by stating the problem in rather general and vague
terms: Suppose $F(x)$ is approximately known for $n+1$ values $x = x_0, x_1, \ldots, x_n$, and
that $F(x)$ can be represented by a polynomial of degree $m \le n$,

\[ F(x) = a + bx + cx^2 + \cdots + hx^m. \]

Find the value of $F(x)$ at $x = X$ so that the errors in $F(x_0), F(x_1), \ldots, F(x_n)$
have minimal influence on $F(X)$. From a practical standpoint, the problem makes
good sense. For example, the values of some function $y = F(x)$ may be obtained by
observation for $x = x_0, x_1, \ldots, x_n$. These values would have experimental errors so
that $y_i \approx F(x_i)$. Thus, $F(x)$ is a polynomial of degree $m \le n$, and the problem is to
determine $F(X)$ in such a way that the errors of observation have the least influence.
In more specific terms, Chebyshev stated the problem:

Find a polynomial $F(X)$ of the form

\[ F(X) = \mu_0\lambda_0(X)y_0 + \mu_1\lambda_1(X)y_1 + \cdots + \mu_n\lambda_n(X)y_n, \tag{24.50} \]

where $\lambda_i(X)$ are unknown polynomials of degree $\le m$ and $\mu_i > 0$ are weights
associated with observed values $y_i$, subject to the following two conditions: The
identity

\[ f(X) = \mu_0\lambda_0(X)f(x_0) + \mu_1\lambda_1(X)f(x_1) + \cdots + \mu_n\lambda_n(X)f(x_n) \tag{24.51} \]

must hold for any polynomial $f$ of degree at most $m$; and one must minimize the sum

\[ W(X) = \mu_0(\lambda_0(X))^2 + \mu_1(\lambda_1(X))^2 + \cdots + \mu_n(\lambda_n(X))^2. \tag{24.52} \]


35 Chebyshev (1899-1907) vol. 1, pp. 203-230.

Thus, $W(X)$ had to be minimized with respect to the constraints of (24.51),
equivalent to the $m+1$ conditions

\[ X^k = \mu_0\lambda_0(X)x_0^k + \mu_1\lambda_1(X)x_1^k + \cdots + \mu_n\lambda_n(X)x_n^k, \tag{24.53} \]

for $k = 0,1,\ldots,m$. Then Chebyshev applied the method of Lagrange multipliers
with $\lambda_0, \lambda_1, \ldots, \lambda_n$ as the variables and with $l_0(X), l_1(X), \ldots, l_m(X)$ as the $m+1$
multipliers. This gave him the $n+1$ relations

\[ \frac{\partial W}{\partial\lambda_i} - \sum_{k=0}^{m} l_k(X)\,\frac{\partial}{\partial\lambda_i}\left(\mu_0\lambda_0x_0^k + \cdots + \mu_n\lambda_nx_n^k - X^k\right) = 0, \]

or

\[ 2\lambda_i(X) = l_0(X) + l_1(X)x_i + \cdots + l_m(X)x_i^m, \tag{24.54} \]

for $i = 0,1,\ldots,n$. Chebyshev wrote that the whole difficulty boiled down to
solving this system of equations. He denoted the polynomial on the right-hand side of
equation (24.54) by $2K_m(X,x_i)$, obtaining

\[ K_m(X,x) = \frac{1}{2}\sum_{k=0}^{m} l_k(X)x^k. \tag{24.55} \]

Thus, Chebyshev's problem was to find an expression for $\lambda_i(X) = K_m(X,x_i)$,
$i = 0,1,\ldots,n$. Note that the constraints (24.53) could be written as

\[ \sum_{i=0}^{n}\mu_iK_m(X,x_i)\,x_i^k = X^k, \qquad k = 0,1,\ldots,m. \tag{24.56} \]

These relations implied that the polynomials $K_m(X,x_i)$ should be such that, for
some function $A(X)$,

\[ \sum_{i=0}^{n}\frac{\mu_iK_m(X,x_i)}{x-x_i} = \frac{1}{x-X} + \frac{A(X)}{x^{m+2}} + \cdots. \tag{24.57} \]

Note that this relation could be rewritten as

\[ K_m(X,x)\sum_{i=0}^{n}\frac{\mu_i}{x-x_i} - N(X,x) = \frac{1}{x-X} + \frac{A(X)}{x^{m+2}} + \cdots, \tag{24.58} \]

with $N(X,x)$ a polynomial of degree $m-1$ in $x$. Of course, this was made possible
by the elementary relation that if $g(x)$ is a polynomial of degree $m$, then there is a
polynomial $h(x)$ of degree $m-1$ such that

\[ \frac{g(x)-g(x_i)}{x-x_i} = h(x). \]

Here Chebyshev considered an additional but related problem: Find a polynomial
$\psi_m(x)$ of degree $m$ and a polynomial $\pi_m(x)$ of degree at most $m-1$ so that

\[ \psi_m(x)\sum_{i=0}^{n}\frac{\mu_i}{x-x_i} - \pi_m(x) = O\!\left(\frac{1}{x^{m+1}}\right). \tag{24.59} \]

Chebyshev's study of Gauss's paper on numerical integration showed him that the
answer lay in the continued fraction expansion

\[ \sum_{i=0}^{n}\frac{\mu_i}{x-x_i} = \cfrac{1}{q_1+\cfrac{1}{q_2+\cfrac{1}{q_3+\cdots}}}, \tag{24.60} \]

where $q_m = A_mx + B_m$ were linear functions for $m = 1,2,\ldots$. In fact, the $m$th
($m \le n+1$) convergent was the rational function $\frac{\pi_m(x)}{\psi_m(x)}$, producing the polynomials
required in (24.59). Chebyshev proved that

\[ \lambda_i(x) = K_m(x,x_i) = (-1)^m\,\frac{\psi_{m+1}(x)\psi_m(x_i) - \psi_m(x)\psi_{m+1}(x_i)}{x-x_i}. \tag{24.61} \]


He then derived another relation for $\lambda_i(x)$ by using the three-term relation for $\psi_m(x)$
obtained from the continued fraction

\[
\begin{aligned}
\psi_{m+1}(x) &= q_{m+1}\psi_m(x) + \psi_{m-1}(x)\\
&= (A_{m+1}x + B_{m+1})\psi_m(x) + \psi_{m-1}(x).
\end{aligned}
\]

When this was substituted in (24.61) he could obtain, after simplification,

\[
(-1)^m\lambda_i(x) = A_{m+1}\psi_m(x)\psi_m(x_i) - \frac{\psi_m(x)\psi_{m-1}(x_i) - \psi_{m-1}(x)\psi_m(x_i)}{x-x_i}.
\]

Repeating this process $m$ times, he got

\[
\lambda_i(x) = \sum_{j=0}^{m}(-1)^jA_{j+1}\psi_j(x)\psi_j(x_i) = (-1)^m\,\frac{\psi_{m+1}(x)\psi_m(x_i) - \psi_m(x)\psi_{m+1}(x_i)}{x-x_i}. \tag{24.62}
\]

This important relation is usually called the Christoffel-Darboux formula; Christoffel
and Darboux obtained it in a similar way, but Chebyshev published the formula more
than a decade earlier. When he substituted the value of $\lambda_i(x)$ in (24.62) in (24.50), Chebyshev had

\[
F(x) = \sum_{i=0}^{n}\left(\sum_{j=0}^{m}(-1)^jA_{j+1}\psi_j(x_i)\psi_j(x)\right)\mu_iF(x_i). \tag{24.63}
\]

He then set $F(x) = \psi_m(x)$ and equated the coefficients of $\psi_k(x)$ on both sides to
obtain the orthogonality relation

\[ (-1)^kA_{k+1}\sum_{i=0}^{n}\mu_i\psi_k(x_i)\psi_m(x_i) = \delta_{km}, \tag{24.64} \]

and in particular

\[ (-1)^kA_{k+1} = \frac{1}{\sum_{i=0}^{n}\mu_i\psi_k^2(x_i)}, \tag{24.65} \]

where $\psi_k(x)$ was the sequence of discrete orthogonal polynomials. So his final result
was expressed as

\[ \lambda_i(x) = \sum_{k=0}^{m}\frac{\psi_k(x_i)\psi_k(x)}{\sum_{j=0}^{n}\mu_j\psi_k^2(x_j)}. \tag{24.66} \]

Chebyshev concluded his paper by stating and proving two results on least squares.
For the first result, he supposed $V$ to be a polynomial of degree $m$ with the coefficient
of $x^m$ the same as that of $\psi_m(x)$. He then showed that the sum $\sum_{i=0}^{n}\mu_iV^2(x_i)$ had
the least value when $V = \psi_m(x)$. To prove this, Chebyshev set

\[ V = A_0\psi_0(x) + \cdots + A_{m-1}\psi_{m-1}(x) + \psi_m(x) \]

and then

\[ \sum_{i=0}^{n}\mu_iV^2(x_i) = \sum_{i=0}^{n}\mu_i\left(A_0\psi_0(x_i) + \cdots + A_{m-1}\psi_{m-1}(x_i) + \psi_m(x_i)\right)^2. \]

For a minimum, the derivatives with respect to the $A_j$ should be zero. Thus,

\[ 2\sum_{i=0}^{n}\mu_i\psi_j(x_i)\left(A_0\psi_0(x_i) + \cdots + A_{m-1}\psi_{m-1}(x_i) + \psi_m(x_i)\right) = 0, \qquad j = 0,\ldots,m-1. \]

An application of the orthogonality relation (24.64) gave

\[ A_j\sum_{i=0}^{n}\mu_i\psi_j^2(x_i) = 0, \qquad j = 0,1,\ldots,m-1. \]

This implied $A_j = 0$, $j = 0,1,\ldots,m-1$, and hence $V = \psi_m(x)$. In his second
result, Chebyshev proved that

\[ \sum_{i=0}^{n}\mu_i\left(F(x_i) - \sum_{j=0}^{m}A_j\psi_j(x_i)\right)^2 \tag{24.67} \]

was a minimum when

\[ A_j = \frac{\sum_{i=0}^{n}\mu_i\psi_j(x_i)F(x_i)}{\sum_{i=0}^{n}\mu_i\psi_j^2(x_i)}. \]

He once more took the derivatives of (24.67) with respect to $A_j$, $j = 0,\ldots,m$, to
obtain

\[ 2\sum_{i=0}^{n}\mu_i\psi_j(x_i)\left(F(x_i) - \sum_{k=0}^{m}A_k\psi_k(x_i)\right) = 0, \qquad j = 0,1,\ldots,m. \]

Again using the orthogonality relation (24.64), these equations reduced to

\[ \sum_{i=0}^{n}\mu_i\psi_j(x_i)F(x_i) - A_j\sum_{i=0}^{n}\mu_i\psi_j^2(x_i) = 0, \qquad j = 0,1,\ldots,m, \]

implying the required result.
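Chebyshev's second least-squares result translates directly into a short computation: once polynomials orthogonal on the nodes $x_i$ with weights $\mu_i$ are available, each coefficient $A_j$ is a ratio of two sums and no linear system has to be solved. In the sketch below the orthogonal polynomials are produced by Gram-Schmidt on the monomials rather than by Chebyshev's continued fraction; this is a substitute chosen for brevity, and it yields the same polynomials up to constant factors, which cancel in the fitted polynomial. All names are illustrative.

```python
from fractions import Fraction

def evaluate(poly, x):
    # Horner evaluation; coefficients are listed in increasing degree.
    value = Fraction(0)
    for coef in reversed(poly):
        value = value * x + coef
    return value

def inner(p, q, nodes, weights):
    # Discrete inner product: sum of mu_i p(x_i) q(x_i).
    return sum(w * evaluate(p, x) * evaluate(q, x) for x, w in zip(nodes, weights))

def orthogonal_polys(nodes, weights, m):
    # Monic polynomials psi_0..psi_m orthogonal on the nodes (requires m < len(nodes)).
    polys = []
    for degree in range(m + 1):
        p = [Fraction(0)] * degree + [Fraction(1)]
        for q in polys:
            coef = inner(p, q, nodes, weights) / inner(q, q, nodes, weights)
            padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [pc - coef * qc for pc, qc in zip(p, padded)]
        polys.append(p)
    return polys

def least_squares_fit(nodes, weights, values, m):
    # A_j = sum mu_i psi_j(x_i) F(x_i) / sum mu_i psi_j(x_i)^2, then sum A_j psi_j.
    fit = [Fraction(0)] * (m + 1)
    for psi in orthogonal_polys(nodes, weights, m):
        numerator = sum(w * evaluate(psi, x) * y
                        for x, w, y in zip(nodes, weights, values))
        A = numerator / inner(psi, psi, nodes, weights)
        for k, coef in enumerate(psi):
            fit[k] += A * coef
    return fit

nodes = [Fraction(i) for i in range(6)]
weights = [Fraction(1)] * 6
target = [Fraction(2), Fraction(-3), Fraction(1, 2)]   # 2 - 3x + x^2/2
values = [evaluate(target, x) for x in nodes]
print(least_squares_fit(nodes, weights, values, 2) == target)
```

When the data come from a polynomial of degree at most $m$, the fit reproduces it exactly, which is the discrete counterpart of identity (24.51).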


24.11 Chebyshev and Orthogonal Matrices 


In his 1855 paper, before the theory of matrices was formally developed, Chebyshev 
gave a very interesting construction of an orthogonal matrix, noting that in a paper of 
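1771, Euler also constructed such squares.³⁶ However, after 1855, Chebyshev did not
develop this topic further. Chebyshev defined the function

\[ \Phi_k(x_i) = \sqrt{\mu_i}\,\phi_k(x_i), \qquad i,k = 0,1,\ldots,n, \]

where

\[ \phi_k(x) = \frac{\psi_k(x)}{\sqrt{\sum_{i=0}^{n}\mu_i\psi_k^2(x_i)}}. \]

He then considered the square tableau

\[
\begin{matrix}
\Phi_0(x_0) & \Phi_0(x_1) & \cdots & \Phi_0(x_n)\\
\Phi_1(x_0) & \Phi_1(x_1) & \cdots & \Phi_1(x_n)\\
\vdots & \vdots & & \vdots\\
\Phi_n(x_0) & \Phi_n(x_1) & \cdots & \Phi_n(x_n).
\end{matrix}
\tag{24.68}
\]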


From the orthogonality relation (24.64), he deduced that the sum of the squares of 
the terms in each row and in each column was one. Also, in any two rows or columns, 
the sum of the products of their corresponding terms would be zero. 
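The row and column properties of the tableau (24.68) can be observed on a small example. The sketch below builds the values $\psi_k(x_i)$ by Gram-Schmidt on the vectors $(x_i^k)$ with the weighted inner product, which stands in for Chebyshev's continued-fraction construction, and then scales by $\sqrt{\mu_i}$ and the norm of $\psi_k$. The nodes and weights are arbitrary illustrative values.

```python
from fractions import Fraction
from math import sqrt

def orthogonal_tableau(nodes, weights):
    # rows[k][i] = psi_k(x_i), built exactly by Gram-Schmidt on the monomial values.
    rows = []
    for degree in range(len(nodes)):
        v = [Fraction(x) ** degree for x in nodes]
        for u in rows:
            coef = (sum(w * a * b for w, a, b in zip(weights, v, u))
                    / sum(w * b * b for w, b in zip(weights, u)))
            v = [a - coef * b for a, b in zip(v, u)]
        rows.append(v)
    # Phi_k(x_i) = sqrt(mu_i) psi_k(x_i) / sqrt(sum_j mu_j psi_k(x_j)^2)
    table = []
    for u in rows:
        norm = sqrt(sum(w * b * b for w, b in zip(weights, u)))
        table.append([sqrt(w) * float(b) / norm for w, b in zip(weights, u)])
    return table

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

M = orthogonal_tableau([0, 1, 2, 3], [Fraction(1), Fraction(2), Fraction(1), Fraction(3)])
columns = [list(col) for col in zip(*M)]
print([[round(dot(r, s), 12) for s in M] for r in M])
print([[round(dot(c, d), 12) for d in columns] for c in columns])
```

Both printed arrays are the identity up to rounding. Orthonormality of the rows is the relation (24.64); orthonormality of the columns then follows because the tableau is square.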


24.12 Chebyshev’s Discrete Legendre and Jacobi Polynomials


In his 1864 paper “Sur l’interpolation,” Chebyshev took $\mu_i = 1$ and $x_i = i$ for
$i = 0,1,\ldots,n-1$ with $\psi_m$ as already defined.³⁷ See equations (24.59) and (24.60).


36 Eu. I-6 pp. 287-315. E 407.
37 Chebyshev (1899-1907) vol. 1, pp. 541-560.

Then the polynomials $\psi_0(x), \psi_1(x), \psi_2(x), \ldots$ were the denominators in the
continued fraction expansion of

\[ \frac{1}{x} + \frac{1}{x-1} + \frac{1}{x-2} + \cdots + \frac{1}{x-n+1}. \]

By (24.64), these in turn satisfied the relations

\[ \sum_{i=0}^{n-1}\psi_l(i)\psi_m(i) = 0 \quad\text{for } m < l. \tag{24.69} \]


Chebyshev found a Rodrigues-type formula for $\psi_l(x)$, where the differential
operator was replaced by the finite difference operator. His two-step approach was exactly
the discrete analog of the method employed by Jacobi in his paper on numerical
integration. In the first step, Chebyshev proved that if there was a polynomial $f(x)$
of degree $l$ such that

\[ \sum_{i=0}^{n-1} f(i)\psi_m(i) = 0 \quad\text{for } m < l, \]

then there existed a constant $C$ such that $f(x) = C\psi_l(x)$. For the second step, he
showed that the polynomial of degree $l$ given by

\[ f(x) = \Delta^l\, x(x-1)\cdots(x-l+1)(x-n)(x-n-1)\cdots(x-n-l+1) \]

satisfied the required condition. Thus, he had the Rodrigues-type formula $\psi_l(x) = C_lf(x)$,
where $C_l$ was a constant.
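The second step of Chebyshev's argument can be verified in integer arithmetic. The sketch below evaluates the $l$-th forward difference of $g(x) = x(x-1)\cdots(x-l+1)(x-n)(x-n-1)\cdots(x-n-l+1)$ through the binomial expansion $\Delta^l g(x) = \sum_{j=0}^{l}(-1)^{l-j}\binom{l}{j}g(x+j)$ and checks that its sum against every power $i^m$, $m < l$, over $i = 0,\ldots,n-1$ vanishes. The function names are illustrative.

```python
from math import comb

def g(x, l, n):
    # x(x-1)...(x-l+1) * (x-n)(x-n-1)...(x-n-l+1)
    value = 1
    for k in range(l):
        value *= (x - k) * (x - n - k)
    return value

def psi(x, l, n):
    # l-th forward difference of g at x, up to Chebyshev's constant C_l.
    return sum((-1) ** (l - j) * comb(l, j) * g(x + j, l, n) for j in range(l + 1))

n = 7
for l in range(1, 5):
    moments = [sum(psi(i, l, n) * i ** m for i in range(n)) for m in range(l)]
    print(l, moments)
```

Every printed moment is zero. The reason is summation by parts: $g$ vanishes at $0,1,\ldots,l-1$ and at $n,n+1,\ldots,n+l-1$, so all boundary terms drop out, just as the boundary terms vanish in the integration by parts behind the classical Rodrigues formula.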

Chebyshev also gave an interesting interpolation formula in terms of $\psi_l(x)$. He
supposed $u_0, u_1, \ldots, u_{n-1}$ to be the values of a function $u$ at $x = 0,1,\ldots,n-1$. The
interpolation formula would then be expressed as

\[
\begin{aligned}
u = \frac{1}{n}\sum_{i=0}^{n-1}u_i &+ \frac{3\sum_{i=0}^{n-1}(i+1)(n-i-1)\Delta u_i}{1^2\,n(n^2-1^2)}\,\Delta x(x-n)\\
&+ \frac{5\sum_{i=0}^{n-1}(i+1)(i+2)(n-i-1)(n-i-2)\Delta^2u_i}{(2!)^2\,n(n^2-1^2)(n^2-2^2)}\,\Delta^2 x(x-1)(x-n)(x-n-1) + \cdots.
\end{aligned}
\tag{24.70}
\]

Interestingly, in an 1858 paper, “Sur une nouvelle série,”³⁸ Chebyshev took the
points as $x_1 = h$, $x_2 = 2h$, \ldots, $x_n = nh$ such that the orthogonal polynomials could
be expressed in the form

\[ \psi_l(x) = C\,\Delta^l\,(x-h)(x-2h)\cdots(x-lh)(x-nh-h)\cdots(x-nh-lh). \]


38 Chebyshev (1858).

The formula corresponding to (24.70) would then be written

\[
\begin{aligned}
u = \frac{1}{n}\sum u_i &+ \frac{3\sum i(n-i)\Delta u_i}{1^2\,n(n^2-1^2)h^2}\,\Delta(x-h)(x-nh-h)\\
&+ \frac{5\sum i(i+1)(n-i)(n-i-1)\Delta^2u_i}{(2!)^2\,n(n^2-1^2)(n^2-2^2)h^4}\,\Delta^2(x-h)(x-2h)(x-nh-h)(x-nh-2h) + \cdots.
\end{aligned}
\tag{24.71}
\]

Chebyshev observed that if he set $h = \frac{1}{n}$ in his interpolation formula (24.71) and let
$n \to \infty$, he obtained a series in terms of Legendre polynomials. On the other hand,
if he set $h = \frac{1}{n^2}$ and let $n \to \infty$, he arrived at the Maclaurin series expansion! At the
end of the paper, Chebyshev made the insightful remark that one might use discrete
orthogonal polynomials to approximate the sum $\sum_{i=1}^{n}F(ih)$, just as Gauss had used
Legendre polynomials for numerical integration.

Chebyshev appears to have independently discovered the Jacobi polynomials.³⁹ He
gave the generating function and proved orthogonality for these polynomials in an
1870 paper based on the work of Legendre. Chebyshev there proved that if

\[
F(s,x) = \frac{\left(1+s+\sqrt{1-2sx+s^2}\right)^{-\lambda}\left(1-s+\sqrt{1-2sx+s^2}\right)^{-\mu}}{\sqrt{1-2sx+s^2}}
= \sum_{n=0}^{\infty}T_n(x)s^n,
\]

then

\[ \int_{-1}^{1}F(s,x)F(t,x)(1-x)^{\mu}(1+x)^{\lambda}\,dx \]

was purely a function of $st$. This gave the orthogonality of $T_n(x)$ with respect
to the beta distribution $(1-x)^{\mu}(1+x)^{\lambda}$. Note that the $T_n(x)$ are the Jacobi
polynomials, generalizing the Legendre polynomials. On the basis of this work, in
1875 Chebyshev defined the discrete Jacobi polynomials. He also showed that his
interpolation formulas could be applied to problems in ballistics.


24.13 Exercises


(1) Show that

\[
\frac{1}{\pi}\int_{-1}^{1}\frac{1}{x-t}\,\frac{dt}{\sqrt{1-t^2}} = \frac{1}{\sqrt{x^2-1}} = \cfrac{1}{x-\cfrac{1}{2x-\cfrac{1}{2x-\cdots}}}.
\]

Show also that the denominators of the convergents of the continued fraction
are $\cos\phi$, $\cos 2\phi$, \ldots, where $x = \cos\phi$. See Chebyshev (1899-1907) vol. 1,
pp. 501-508, especially p. 502.


39 Chebyshev (1899-1907) vol. 2, pp. 1-8.


(2) Let $d$ be the greatest integer in $\frac{n}{2}$, where $n$ is a positive integer. Show that the
ultraspherical polynomials $C_n^{\lambda}$ satisfy the relation

\[
C_n^{\lambda}(x) = \sum_{k=0}^{d}\frac{(\lambda)_{n-k}\,(\lambda-\mu)_k\,(n+\mu-2k)}{k!\;\mu\,(\mu+1)_{n-k}}\;C_{n-2k}^{\mu}(x),
\]

where for any quantity $a$, $(a)_k$ denotes the shifted factorial $a(a+1)\cdots(a+k-1)$.
This is known as the Gegenbauer-Hua formula. See Gegenbauer
(1884).


(3) Show that

\[
\int_x^{\infty}\frac{e^{-t}}{t}\,dt = e^{-x}\;\cfrac{1}{x+1-\cfrac{1}{x+3-\cfrac{2^2}{x+5-\cfrac{3^2}{x+7-\cdots}}}}.
\]

Denote the denominator of the $m$th convergent by $f_m(x)$. Then show that

\[
\begin{aligned}
xf_n'(x) &= nf_n(x) - n^2f_{n-1}(x),\\
f_{n+1}(x) &= (x+2n+1)f_n(x) - n^2f_{n-1}(x),\\
\int_{-\infty}^{0}e^{x}f_m(x)f_n(x)\,dx &= (n!)^2\,\delta_{mn}.
\end{aligned}
\]

See Laguerre (1972) vol. 1, pp. 431-435.

(4) Show that

\[
\frac{1}{(n-k)!}\,\frac{d^{n-k}(x^2-1)^n}{dx^{n-k}} = \frac{(x^2-1)^k}{(n+k)!}\,\frac{d^{n+k}(x^2-1)^n}{dx^{n+k}}.
\]

See Rodrigues (1816).

(5) Show that if $z = \cos x$, then

\[
\frac{d^{i-1}(1-z^2)^{\frac{2i-1}{2}}}{dz^{i-1}} = (-1)^{i-1}\,3\cdot5\cdots(2i-1)\,\frac{\sin ix}{i},
\]

\[
\frac{d^{i}(1-z^2)^{\frac{2i-1}{2}}}{dz^{i}}\,dz = (-1)^{i-1}\,3\cdot5\cdots(2i-1)\cos ix\,dx.
\]

See Jacobi (1969) vol. 6, pp. 90-91.


24.14 Notes on the Literature


Goldstine’s (1977) very interesting book on the history of numerical analysis gives 
a thorough and readable account of Gauss’s work on numerical integration; see 
pp. 224-232. Liouville’s Journal, founded in 1836 and the second-oldest mathematics 
journal after Crelle’s Journal, gave Liouville the opportunity to review many papers 
before they appeared in print, and then react to them. For example, the paper of Ivory 
and Jacobi stimulated Liouville to write his two short notes (1837a) and (1837b) in 
the same volume of his journal. Altmann and Ortiz (2005) is completely devoted to 
the work of Rodrigues in and outside of mathematics; see particularly the articles by 
Grattan-Guinness and Askey. 

See N. I. Akhiezer’s readable and insightful commentary on Chebyshev’s work 
on continued fractions in Kolmogorov and Yushkevich (1998). Steffens (2006) 
gives a fairly comprehensive history of Chebyshev and his students’ contributions 
to orthogonal polynomials and approximation theory. Nevai (1990) is a collection 
of interesting papers on orthogonal polynomials and their numerous applications. 
Roy (1993) gives more details on Chebyshev’s work in orthogonal polynomials.